Pooja D
With over 10 years of comprehensive IT experience, I specialize in the analysis, design, and development of Big Data systems, web applications, and data warehousing solutions. My expertise spans a variety of big data technologies, including Hadoop, Spark, Kafka, and Snowflake, coupled with extensive hands-on experience with AWS and Azure cloud services.

I have a proven track record in developing Spark applications for data extraction, transformation, and aggregation using Databricks, and I am proficient in utilizing AWS services such as EC2, S3, EMR, Glue, and Redshift for building robust data pipelines. My experience extends to Azure Big Data technologies, where I have successfully implemented data orchestration and incremental loading with Azure Data Factory and Azure Synapse.

I excel in agile environments, using Scrum methodologies to deliver efficient solutions. My technical skills include programming in Python, Scala, and Java, with a deep understanding of RDD transformations and actions. I am also experienced in handling SQL and NoSQL databases, developing automated data ingestion modules, and optimizing ETL workflows.

Additionally, I have worked with ETL tools such as Informatica, Talend, and DataStage, ensuring high data quality and integrity. My capabilities in data visualization and reporting are highlighted by my work with Tableau and Power BI. Skilled in CI/CD pipelines, I leverage Jenkins, GitLab, Helm, and Kubernetes for seamless build and deployment automation. My strong background in data architecture, combined with my problem-solving skills, enables me to deliver scalable and efficient data solutions.
EMC Insurance Companies
- Website: emcins.com
- Employees: 1,725
Senior Data Engineer, EMC Insurance Companies (May 2023 - Present)

As a Senior Data Engineer at EMC Insurance, I lead the development and management of complex data pipelines and ETL processes to facilitate data ingestion and transformation. I led a cross-functional team of 5 engineers, managing the end-to-end project lifecycle from requirement gathering to implementation and ensuring deliverables met strict deadlines and quality standards. My role involves gathering business requirements, translating them into clear specifications, and leveraging tools like Spark, Databricks, and Informatica to extract, transform, and aggregate data from various sources. I am responsible for estimating, monitoring, and troubleshooting Spark Databricks clusters and have successfully implemented the Databricks Unity Catalog for centralized metadata management.

I utilize AWS services, including EC2, S3, EMR, Glue, and Redshift, to perform data operations, normalization, and storage. Additionally, I have developed Spark applications using Python, handled data from RDBMS and streaming sources, and improved the performance of Hadoop algorithms with Spark Streaming APIs. My expertise extends to Azure Big Data technologies, where I have executed POCs and managed data orchestration using Azure Data Factory and Synapse.

I have implemented real-time data processing jobs using Kafka and Spark Streaming, developed pre-processing jobs with Spark DataFrames, and managed data pipelines with AWS Glue and PySpark. My role also involves maintaining Hadoop clusters on AWS EMR and migrating on-premises applications to AWS.

Furthermore, I have experience with Snowflake, having configured Snowpipe and managed data in its staging area. My work includes performing data analysis using Hive, developing custom UDFs in Python, and integrating various data formats with Pig. I am well-versed in CI/CD pipelines, utilizing Jenkins, GitLab, Helm, and Kubernetes. I operate in an Agile development environment, employing Scrum methodologies to deliver efficient and scalable data solutions.
Senior Data Engineer, Amway (Jul 2021 - Apr 2023)

As a Senior Data Engineer at Amway Corp, I was instrumental in designing and deploying data pipelines using Azure Data Lake, Databricks, and Apache Airflow. I integrated data from both on-premises and cloud sources with Azure Data Factory, performing complex transformations and loading data into Azure Synapse. I developed Spark Scala functions and streaming applications to process real-time data, leveraging Spark DataFrames and UDFs in Databricks for large-scale transformations. I led the migration of key systems to Azure Cloud Services, enhancing data ingestion on HDInsight Spark clusters. My role also involved configuring and managing Kubernetes clusters and implementing CI/CD processes with Azure DevOps. I engineered custom-built adapters for data ingestion from Snowflake, MS SQL, and MongoDB, and implemented real-time data streaming from Apache Kafka to HDFS. Additionally, I developed ETL pipelines and DAG workflows using Python, Airflow, and Apache NiFi, and created interactive Power BI dashboards and reports to provide valuable business insights. Throughout my tenure, I employed Agile methodologies and utilized tools like JIRA and Bitbucket for efficient project management and version control.
Data Engineer, Nationwide (Nov 2018 - Jun 2021)

At Nationwide, I specialized in creating comprehensive data models using ERwin and developed Python scripts to automate data sampling, ensuring data integrity and consistency. My expertise in AWS allowed me to define and deploy robust monitoring, metrics, and logging systems. I optimized SQL queries and tuned the Redshift environment to significantly enhance performance. I developed SSRS reports and SSIS packages for ETL processes and worked with Big Data technologies on AWS, including EC2, S3, EMR, and DynamoDB. I managed security groups on AWS, emphasizing high availability and auto-scaling with Terraform, and implemented CI/CD pipelines using AWS Lambda and AWS CodePipeline. Additionally, I developed exception-handling code for Kafka, automated workflows with Oozie, and created ad hoc queries and reports using SQL Server Reporting Services. My role also involved using Hive SQL, Presto SQL, and Spark SQL for ETL jobs and publishing interactive data visualizations with Tableau and SAS Visual Analytics.
Data Engineer / Hadoop Engineer, Grapesoft Solutions (Nov 2016 - Jul 2018), Hyderabad, Telangana, India

During my tenure at Grapesoft Solutions, I developed and designed applications to process data using Spark and implemented efficient data access techniques in Hive, including partitioning, dynamic partitions, and buckets. I created and optimized MapReduce jobs using Hive, Pig, and Java for processing large datasets. My responsibilities included managing Hadoop jobs for text data processing, developing job scheduler scripts for data migration, and installing and configuring a multi-node Hadoop cluster. I utilized Azure Data Factory for on-cloud ETL processing, created Databricks notebooks using PySpark, Scala, and Spark SQL, and performed data reformatting after extraction from HDFS. I automated tasks with Oozie, imported and exported data using Sqoop, and developed fully parameterized ADF pipelines for efficient code management. Additionally, I performed ad hoc data analyses using Azure Databricks and created pipelines in ADF for data transformation and loading from various sources. My role also involved developing Pig Latin scripts, implementing MapReduce jobs for data cleansing, and managing RDBMS data transfer to HDFS for processing.
Data Analyst, Maisa Solutions, Inc. (Jun 2014 - Oct 2016), Hyderabad, Telangana, India

At Maisa Solutions, I honed my skills as a Data Analyst by leveraging technologies such as MySQL and Excel PowerPivot to query test data and meet end-user requirements. I optimized SQL queries to enhance performance and eliminate data discrepancies. My role involved collecting data in near-real time from AWS S3 buckets using Spark Streaming, performing necessary transformations and aggregations, and persisting the data in HDFS. I applied dimensional data modeling techniques and ETL storyboarding, and analyzed data by writing custom MySQL queries for better performance. Additionally, I built simulation applications using HTML, CSS, JavaScript, and Bootstrap, performed data analysis and profiling, and developed weekly reports in collaboration with business analysts using Crystal Reports. I developed ETL procedures using Informatica PowerCenter to ensure data conformity and compliance, re-engineered existing ETL processes for improved performance, and automated workflows with Python scripts and Unix shell scripting.
Pooja D Education Details
- Electrical, Electronic and Communications Engineering Technology/Technician
Frequently Asked Questions about Pooja D
What company does Pooja D work for?
Pooja D works for EMC Insurance Companies.
What is Pooja D's role at the current company?
Pooja D's current role is Senior Data Engineer, with 10+ years in FinTech, E-Commerce, Big Data, and Cloud Technologies, specializing in building scalable data pipelines with Python, SQL, Hadoop, Spark, AWS, and Azure.
What schools did Pooja D attend?
Pooja D attended Jntuh College Of Engineering Hyderabad.
Who are Pooja D's colleagues?
Pooja D's colleagues are David Oberkrieser, Steve Gable, Brian Arnold, Kevin Gute, Julie Burger, Kelly Phillips, and Tiffany C.