Data Engineer
Current
- Developed and optimized large-scale data processing workflows using PySpark, improving processing speed and efficiency by 40% for big data analytics projects. Collaborated closely with subject matter experts.
- Developed reusable ADF pipelines and Databricks notebooks incorporating ETL and ELT transformations to streamline data processing across projects.
- Created and provisioned Databricks clusters for batch and continuous streaming data processing, and installed the required libraries on them.
- Used Azure Logic Apps to build workflows that schedule and automate batch jobs by integrating apps, ADF pipelines, and other services such as HTTP requests and email triggers.
- Implemented CI/CD processes in new and existing ADF environments, secured credentials in Key Vault, and deployed production-ready ADF resources using release pipelines.
- Developed ETL and ELT pipelines to meet business use cases using ADF data flows and pipelines, Synapse, and Data Lake, and automated them with ADF triggers.