Data Engineer
Current
- Developed a dbt-powered data cleaning pipeline in Azure Synapse, improving data accuracy by 30% through type conformance, multi-source joins, and advanced Python cleaning scripts, resulting in unified datasets.
- Built and optimized real-time data pipelines using Azure Data Factory, reducing query times by 40% and increasing analytics throughput by 35% through seamless integration of On-Prem SQL Server with ADLS Gen 2 and Azure.
- Architected and deployed an AWS RDS and S3-based data lake, boosting data retrieval speeds by 25% and reducing storage costs by 15%, resulting in more efficient support for analytics projects across teams.
- Engineered automated Apache Airflow pipelines, increasing data processing efficiency by 25%, cutting processing times by 50%, and expanding advanced analytics capabilities across the organization.
- Led the development of adaptive web scraping solutions with BeautifulSoup4, Scrapy, and Selenium, extracting 100,000+ data points monthly with 99% accuracy and storing them in AWS RDS to improve data accessibility.