Senior Hadoop Developer
Current
- Developed and deployed the project using AWS EC2, EMR, Glue, S3, Lambda, CloudFormation, Elastic Beanstalk, CloudWatch, Elasticsearch, DMS, SQS, SNS, and Amazon Kinesis services to process and store data in Snowflake.
- Created automated Databricks workflow notebooks in Python to orchestrate multiple data loads, along with Delta Lake tables for metadata storage.
- Built PySpark applications and jobs to integrate data from other sources and process it through Spark data pipelines.
- Optimized BigQuery SQL queries by selecting appropriate partitioning and clustering columns to improve query performance.
- Helped develop Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights.
- Developed parameterized BigQuery SQL queries to load data from the Hive/Presto stage table into the Hive/Presto target table, often facilitated through Google Cloud SQL.