Lead Data Engineer
- Led the design and development of a big data pipeline for Lowe's Network Flowcast Model (NFC), which predicts storage and network volume metrics.
- Architected and implemented data ingestion with AWS Glue, integrating data from diverse sources.
- Used Spark, Tez, and MapReduce (MR) processing engines on EMR clusters (managed via Boto3 scripts) to transform and analyze complex data at scale.
- Developed Airflow workflows to orchestrate data movement and processing tasks, ensuring timely data availability for the NFC model.
- Built data pipelines that ingest Kafka streams and JSON files containing complex data structures into Hive tables optimized for efficient querying.
- Implemented business logic in Spark, Tez, and MapReduce to transform data in Hive tables, enabling data analysis for the business team.