Project Engineer
Chennai, Tamil Nadu, India
- Spearheaded end-to-end data ingestion and validation for client-provided CSVs using AWS S3, Glue, and RDS, ensuring accurate and efficient data processing.
- Developed PySpark-based ETL pipelines that transform and optimize data from the raw to the processed layer, using complex SQL for staging-level transformations.
- Migrated multiple data pipelines to AWS Redshift using Airflow on EC2 instances, improving data accessibility and retrieval times for business-critical insights.
- Led batch data migration with AWS S3, Glue, RDS, EMR, and Redshift, streamlining storage, transformation, and processing across cloud environments.
- Migrated historical data from Netezza to AWS, moving over 100 tables with AWS DMS and automating schema conversion with AWS SCT.
- Improved cloud architecture efficiency by reducing AWS service costs and improving the performance of large-scale data workflows.