Pooja D


Senior Data Engineer | 10+ years in FinTech, E-Commerce, Big Data, Cloud Technologies | Expert in Building Scalable Data Pipelines | Expert in Python, SQL, Hadoop, Spark, AWS & Azure Specialist @ EMC Insurance Companies
Des Moines, Iowa, United States
About Pooja D

With over 10 years of comprehensive IT experience, I specialize in the analysis, design, and development of Big Data systems, web applications, and data warehousing solutions. My expertise spans a variety of big data technologies, including Hadoop, Spark, Kafka, and Snowflake, coupled with extensive hands-on experience with AWS and Azure cloud services.

I have a proven track record in developing Spark applications for data extraction, transformation, and aggregation using Databricks, and I am proficient in utilizing AWS services such as EC2, S3, EMR, Glue, and Redshift for building robust data pipelines. My experience extends to Azure Big Data technologies, where I have successfully implemented data orchestration and incremental loading with Azure Data Factory and Azure Synapse.

I excel in agile environments, adept at using Scrum methodologies to deliver efficient solutions. My technical skills include programming in Python, Scala, and Java, with a deep understanding of RDD transformations and actions. I am also experienced in handling SQL and NoSQL databases, developing automated data ingestion modules, and optimizing ETL workflows.

Additionally, I have worked with various ETL tools like Informatica, Talend, and DataStage, ensuring high data quality and integrity. My capabilities in data visualization and reporting are highlighted by my work with Tableau and Power BI.

Skilled in CI/CD pipelines, I leverage Jenkins, GitLab, Helm, and Kubernetes for seamless build and deployment automation. My strong background in data architecture, combined with my problem-solving skills, enables me to deliver scalable and efficient data solutions.

Pooja D's Current Company Details
EMC Insurance Companies

Website: emcins.com
Employees: 1725
Pooja D Work Experience Details
  • EMC Insurance Companies
    Senior Data Engineer
    EMC Insurance Companies May 2023 - Present
    As a Senior Data Engineer at EMC Insurance, I lead the development and management of complex data pipelines and ETL processes to facilitate data ingestion and transformation. I have led a cross-functional team of 5 engineers, managing the end-to-end project lifecycle from requirement gathering to implementation and ensuring deliverables met strict deadlines and quality standards. My role involves gathering business requirements, translating them into clear specifications, and leveraging tools like Spark, Databricks, and Informatica to extract, transform, and aggregate data from various sources. I am responsible for estimating, monitoring, and troubleshooting Spark clusters on Databricks and have successfully implemented the Databricks Unity Catalog for centralized metadata. I utilize AWS services, including EC2, S3, EMR, Glue, and Redshift, to perform data operations, normalization, and storage. Additionally, I have developed Spark applications using Python, handled data from RDBMS and streaming sources, and improved Hadoop algorithms' performance with Spark Streaming APIs. My expertise extends to Azure Big Data technologies, where I have executed POCs and managed data orchestration using Azure Data Factory and Synapse. I have implemented real-time data processing jobs using Kafka and Spark Streaming, developed pre-processing jobs with Spark DataFrames, and managed data pipelines with AWS Glue and PySpark. My role also involves maintaining Hadoop clusters on AWS EMR and migrating on-premises applications to AWS. Furthermore, I have experience with Snowflake, having configured Snowpipe and managed data in its staging area. My work includes performing data analysis using Hive, developing custom UDFs in Python, and integrating various data formats with Pig. I am well-versed in CI/CD pipelines, utilizing Jenkins, GitLab, Helm, and Kubernetes. I operate in an Agile development environment, employing Scrum methodologies to deliver efficient and scalable data solutions.
  • Amway
    Senior Data Engineer
    Amway Jul 2021 - Apr 2023
    As a Senior Data Engineer at Amway Corp, I was instrumental in designing and deploying data pipelines using Azure Data Lake, Databricks, and Apache Airflow. I integrated data from both on-premises and cloud sources with Azure Data Factory, performing complex transformations and loading data into Azure Synapse. I developed Spark Scala functions and streaming applications to process real-time data, leveraging Spark DataFrames and UDFs in Databricks for large-scale transformations. I led the migration of key systems to Azure Cloud Services, enhancing data ingestion on HDInsight Spark clusters. My role also involved configuring and managing Kubernetes clusters and implementing CI/CD processes with Azure DevOps. I engineered custom-built adapters for data ingestion from Snowflake, MS SQL, and MongoDB, and implemented real-time data streaming from Apache Kafka to HDFS. Additionally, I developed ETL pipelines and DAG workflows using Python, Airflow, and Apache NiFi, and created interactive Power BI dashboards and reports to provide valuable business insights. Throughout my tenure, I employed Agile methodologies and utilized tools like JIRA and Bitbucket for efficient project management and version control.
  • Nationwide
    Data Engineer
    Nationwide Nov 2018 - Jun 2021
    At Nationwide, I specialized in creating comprehensive data models using ERwin and developed Python scripts to automate data sampling, ensuring data integrity and consistency. My expertise in AWS allowed me to define and deploy robust monitoring, metrics, and logging systems. I optimized SQL queries and tuned the Redshift environment to significantly enhance performance. I developed SSRS reports and SSIS packages for ETL processes and worked with Big Data technologies on AWS, including EC2, S3, EMR, and DynamoDB. I managed security groups on AWS, emphasizing high availability and auto-scaling with Terraform, and implemented CI/CD pipelines using AWS Lambda and AWS CodePipeline. Additionally, I developed code for exception handling in Kafka, automated workflows with Oozie, and created ad hoc queries and reports using SQL Server Reporting Services. My role also involved using Hive SQL, Presto SQL, and Spark SQL for ETL jobs and publishing interactive data visualizations with Tableau and SAS Visual Analytics.
  • Grapesoft Solutions
    Data Engineer/ Hadoop Engineer
    Grapesoft Solutions Nov 2016 - Jul 2018
    Hyderabad, Telangana, India
    During my tenure at Grapesoft Solutions, I excelled in developing and designing applications to process data using Spark and implemented efficient data access techniques in Hive, including partitioning, dynamic partitions, and buckets. I created and optimized MapReduce jobs using Hive, Pig, and Java for processing large datasets. My responsibilities included managing Hadoop jobs for text data processing, developing job scheduler scripts for data migration, and installing and configuring a multi-node Hadoop cluster. I utilized Azure Data Factory for on-cloud ETL processing, created Databricks notebooks using PySpark, Scala, and Spark SQL, and performed data reformation post-extraction from HDFS. I automated tasks with Oozie, imported and exported data using Sqoop, and developed fully parameterized ADF pipelines for efficient code management. Additionally, I engaged in ad-hoc data analyses using Azure Databricks and created pipelines in ADF for data transformation and loading from various sources. My role also involved developing Pig Latin scripts, implementing MapReduce jobs for data cleansing, and managing RDBMS data transfer to HDFS for processing.
  • Maisa Solutions, Inc.
    Data Analyst
    Maisa Solutions, Inc. Jun 2014 - Oct 2016
    Hyderabad, Telangana, India
    At Maisa Solutions, I honed my skills as a Data Analyst by leveraging technologies such as MySQL and Excel PowerPivot to query test data and meet end-user requirements. I optimized SQL queries to enhance performance and eliminate data discrepancies. My role involved collecting data in near-real-time from AWS S3 buckets using Spark Streaming, performing necessary transformations and aggregations, and persisting the data in HDFS. I applied dimensional data modeling techniques and ETL storyboarding, and analyzed data by writing custom MySQL queries for better performance. Additionally, I built simulation applications using HTML, CSS, JavaScript, and Bootstrap, performed data analysis and profiling, and developed weekly reports in collaboration with business analysts using Crystal Reports. I developed ETL procedures using Informatica PowerCenter to ensure data conformity and compliance, re-engineered existing ETL processes for improved performance, and automated workflows with Python scripts and Unix shell scripting.

Pooja D Education Details

Frequently Asked Questions about Pooja D

What company does Pooja D work for?

Pooja D works for EMC Insurance Companies.

What is Pooja D's role at the current company?

Pooja D's current role is Senior Data Engineer | 10+ years in FinTech, E-Commerce, Big Data, Cloud Technologies | Expert in Building Scalable Data Pipelines | Expert in Python, SQL, Hadoop, Spark, AWS & Azure Specialist.

What schools did Pooja D attend?

Pooja D attended JNTUH College of Engineering Hyderabad.

Who are Pooja D's colleagues?

Pooja D's colleagues are David Oberkrieser, Steve Gable, Brian Arnold, Kevin Gute, Julie Burger, Kelly Phillips, and Tiffany C.
