Divya P

Actively Looking for Data Engineer @ Berkley Medical Management Solutions (a Berkley Company)
Divya P's Location
Frisco, Texas, United States
About Divya P

Divya P is a Sr Data Engineer at Berkley Medical Management Solutions (a Berkley Company) and is actively looking for a Data Engineer role.

Divya P's Current Company Details
Berkley Medical Management Solutions (a Berkley Company)

Actively Looking for Data Engineer
Divya P Work Experience Details
  • Berkley Medical Management Solutions (a Berkley Company)
    Sr Data Engineer
    Jan 2021 - Present
    Boston, MA, US
    • Built scalable databases supporting ETL processes using SQL and Spark.
    • Used Spark JDBC drivers to ingest data from a variety of relational databases into HDFS (see the sketch after this entry).
    • Converted raw data to serialized formats such as Avro and Parquet to reduce data processing time and increase data transfer efficiency across the network.
    • Applied normalization and de-normalization techniques for optimum performance in relational and dimensional database environments.
    • Moved data between GCP and Azure using Azure Data Factory.
    • Utilized AWS services such as EMR, S3, Glue metastore, and Athena extensively to build data applications.
    • Evaluated Snowflake design considerations for any change in the application.
    • Designed, developed, and tested Extract, Transform, Load (ETL) applications against different types of sources.
    • Created files and tuned SQL queries in Hive using Hue; implemented MapReduce jobs in Hive by querying the available data.
    • Used AWS Athena extensively to ingest structured data from S3 into other systems such as Redshift and to produce reports.
    • Applied tuning techniques such as partitioning, caching, broadcasting, and bucketing to improve Spark jobs.
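
As a rough illustration of the JDBC ingestion and Spark tuning bullets above, here is a minimal PySpark sketch; the connection URL, credentials, table, paths, and column names are all hypothetical placeholders, not details taken from this profile:

```python
# Hypothetical PySpark sketch: ingest a relational table over JDBC into
# HDFS as Parquet, then broadcast-join a small dimension table. Every
# name and connection detail below is an illustrative assumption.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("jdbc-to-hdfs-ingest").getOrCreate()

# Parallel JDBC scan: partition the read on a numeric key so several
# executors pull rows concurrently.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")  # hypothetical
    .option("dbtable", "public.orders")                     # hypothetical
    .option("user", "etl_user")
    .option("password", "******")
    .option("partitionColumn", "order_id")
    .option("lowerBound", "1")
    .option("upperBound", "10000000")
    .option("numPartitions", "16")
    .load()
)

# Land the data in HDFS as Parquet to cut processing time and network
# transfer relative to raw text formats.
orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "hdfs:///data/raw/orders"
)

# Broadcasting: ship a small dimension table to every executor to avoid
# a shuffle join -- one of the tuning techniques listed above.
dims = spark.read.parquet("hdfs:///data/dim/customers")  # hypothetical
enriched = orders.join(broadcast(dims), "customer_id")
```
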
  • Comcast
    Data Engineer
    Nov 2019 - Dec 2020
    Philadelphia, PA, US
    • Collected data from sources such as the customer transaction database, MongoDB, Azure Blob Storage, and MS SQL Server, converting it into an analysis-ready format for detecting fraudulent transactions and customer churn.
    • Imported and exported data between source systems and HDFS for further processing using Apache Sqoop.
    • Developed and deployed the results using Spark and Scala code on a Hadoop cluster running on GCP.
    • Developed Python scripts for file validation in Databricks and automated the process using ADF.
    • Implemented data pipelines in Azure Data Factory to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
    • Designed, developed, and tested dimensional data models using star and snowflake schema methodologies under the Kimball method.
    • Built data pipelines in Airflow on GCP for ETL jobs using a range of Airflow operators, both legacy and newer (see the DAG sketch after this entry).
    • Migrated data from on-prem SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB).
    • Worked extensively on migrating/rewriting existing Oozie jobs to AWS Simple Workflow.
    • Worked with Azure Blob and Data Lake Storage, loading data into Azure Synapse Analytics (DW).
    • Deployed data pipelines in Azure Data Factory using JSON definitions to process the data.
    • Used Azure services such as Azure Data Factory and Logic Apps extensively for ETL, pushing data between databases, Blob Storage, HDInsight HDFS, and Hive tables.
    • Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and Azure Data Lake Analytics.
    • Extracted and loaded data into the data lake with ETL jobs and developed shell scripts to add dynamic partitions to the Hive staging layer.
    • Migrated a quality-monitoring tool from AWS EC2 to AWS Lambda and built logical datasets to administer quality monitoring on Snowflake warehouses.
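
The Airflow bullet above can be illustrated with a minimal DAG sketch. The DAG id, bucket, dataset, and table names are hypothetical, and the GCS-to-BigQuery stack shown is just one plausible GCP pipeline shape, not necessarily the one used here:

```python
# Hypothetical Airflow DAG: load a day's raw files from GCS into a
# BigQuery staging table, then transform into a reporting table.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="daily_transactions_etl",  # hypothetical
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Stage the day's raw Parquet files from GCS into BigQuery.
    load_raw = GCSToBigQueryOperator(
        task_id="load_raw",
        bucket="example-raw-bucket",  # hypothetical
        source_objects=["transactions/{{ ds }}/*.parquet"],
        source_format="PARQUET",
        destination_project_dataset_table="analytics.stg_transactions",
        write_disposition="WRITE_TRUNCATE",
    )

    # Transform staging rows into the reporting (fact) table.
    transform = BigQueryInsertJobOperator(
        task_id="transform",
        configuration={
            "query": {
                "query": (
                    "INSERT INTO analytics.fct_transactions "
                    "SELECT * FROM analytics.stg_transactions "
                    "WHERE txn_date = '{{ ds }}'"
                ),
                "useLegacySql": False,
            }
        },
    )

    load_raw >> transform
```
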
  • Macy's
    Big Data Developer
    Sep 2018 - Oct 2019
    New York, NY, US
    • Participated in designing and developing Big Data application architecture for processing and analyzing data per business requirements and use cases.
    • Analyzed existing application architecture to suggest and implement changes that optimize solutions to problem statements.
    • Recommended new Big Data tools and solutions and developed proofs of concept (POCs) to resolve issues observed with the existing implementation process.
    • Programmed with Big Data technologies and tools such as Apache Spark, Apache Hive, Apache Pig, Apache Sqoop, Apache Storm, and Apache Kafka.
    • Carried out data-engineering responsibilities: preparing, transforming, processing, analyzing, and presenting large amounts of data using Hadoop ecosystem tools and components.
    • Worked with Spark RDDs, DataFrames, and Datasets to implement iterative algorithms and interactive querying, leveraging Spark's in-memory computation for fast data processing.
    • Worked with file formats such as Text, SequenceFile, Avro, ORC, and Parquet, and compression codecs such as Snappy, Gzip, and LZO, to identify the best compression and serialization format for each type of data for efficient storage and processing.
    • Wrote Hive and Pig scripts for data pre-processing, cleansing, and transformation, and extended the core functionality of Hive and Pig with custom Python, Scala, and Java UDFs (user-defined functions); a UDF sketch follows this entry.
    • Ingested structured, unstructured, and semi-structured data from multiple sources into the Hadoop distributed environment using Apache Sqoop, loading the data into Hive and HBase tables after preprocessing.
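
As a sketch of the custom-UDF work described above, here is a minimal PySpark Python UDF; the column and path names are hypothetical, and Hive/Pig UDFs in Java or Scala would follow the same idea through different APIs:

```python
# Hypothetical PySpark sketch: extend built-in functionality with a
# custom Python UDF for a cleansing transformation.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-demo").getOrCreate()

# Semi-structured input with a messy product-code column (hypothetical).
df = spark.read.parquet("hdfs:///data/raw/products")

@udf(returnType=StringType())
def normalize_code(code):
    """Cleanse a product code: strip whitespace, uppercase, drop dashes."""
    if code is None:
        return None
    return code.strip().upper().replace("-", "")

# Apply the UDF column-wise, then write the cleansed output back out.
cleansed = df.withColumn("product_code", normalize_code("product_code"))
cleansed.write.mode("overwrite").parquet("hdfs:///data/clean/products")
```
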
  • iGATE Global Solutions
    Hadoop Developer
    May 2016 - Aug 2017
    • Designed and built terabyte-scale, end-to-end data warehouse infrastructure from the ground up on Redshift, handling millions of records.
    • Developed Oozie workflows to extract data with Sqoop per business requirements.
    • Used Hive during the data-exploration stage to draw insights from processed data in HDFS.
    • Applied expertise in Hive SQL, Presto SQL, and Spark SQL for ETL jobs, choosing the right technology for each job.
    • Responsible for ETL and data validation using SQL Server Integration Services.
    • Designed and developed ETL jobs to extract data from a Salesforce replica and load it into a Redshift data mart (see the load sketch after this entry).
    • Extracted data from Oracle and flat files using SQL*Loader; designed and developed mappings using Informatica.
    • Developed PL/SQL procedures/packages to kick off SQL*Loader control files and procedures that load data into Oracle.
    • Built and maintained complex SQL queries for data analysis, data mining, and data manipulation.
    • Developed matrix and tabular reports with row grouping and sorting.
    • Actively participated in weekly code-review meetings with the technical teams.
    • Participated in the requirements-gathering and analysis phase, documenting business requirements through workshops and meetings with business users.
    • Developed schemas to handle reporting requirements using Tableau.
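
A minimal Python sketch of the staged-load pattern implied by the Redshift bullets above, issuing a Redshift COPY over psycopg2; the cluster endpoint, credentials, bucket, IAM role, and table names are hypothetical placeholders, and COPY-from-S3 is one common mechanism rather than the confirmed one:

```python
# Hypothetical sketch: bulk-load staged S3 files into a Redshift data
# mart table with COPY. All connection details are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="warehouse",
    user="etl_user",
    password="******",
)

copy_sql = """
    COPY datamart.salesforce_accounts
    FROM 's3://example-bucket/salesforce/accounts/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
    FORMAT AS PARQUET;
"""

# The connection context manager commits the transaction on success.
with conn, conn.cursor() as cur:
    cur.execute(copy_sql)
conn.close()
```
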
  • Incresol
    Data Warehouse Developer
    Jul 2013 - Apr 2016
    Aubrey, TX, US
    • Assisted the team with performance tuning of ETL and database processes.
    • Designed, developed, implemented, and assisted in validating processes.
    • Managed own time and task priorities, as well as those of other developers on the project.
    • Worked with data providers to fill data gaps and adjust source-system data structures to facilitate analysis and integration with other company data.
    • Developed mappings, sessions, and workflows.
    • Conducted ETL performance tuning, troubleshooting, support, and capacity estimation.
    • Mapped sources to targets using a variety of tools, including BusinessObjects Data Services (BODI), and designed ETL code to load and transform source data from various formats into a SQL database.
    • Worked extensively with transformations such as Source Qualifier, Expression, Filter, Aggregator, Rank, Lookup, Stored Procedure, Sequence Generator, and Joiner.
    • Created, launched, and scheduled tasks/sessions; configured email notifications; set up tasks to schedule loads at the required frequency using PowerCenter Server Manager; and generated completion messages and status reports from Server Manager.
    • Administered the Informatica server and ran sessions and batches.
    • Developed shell scripts to automate Informatica session loads (a pmcmd automation sketch follows this entry).
    • Involved in performance tuning of Informatica servers.
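
As a sketch of the session-load automation mentioned above, here is a minimal Python wrapper around Informatica's pmcmd CLI; the service, domain, folder, and workflow names are hypothetical, and exact pmcmd flags vary by PowerCenter version:

```python
# Hypothetical sketch: start an Informatica workflow via pmcmd and fail
# loudly if the run does not succeed. Names below are placeholders.
import subprocess
import sys

cmd = [
    "pmcmd", "startworkflow",
    "-sv", "IntSvc_Prod",      # integration service (hypothetical)
    "-d", "Domain_Prod",       # domain (hypothetical)
    "-u", "etl_user",
    "-p", "******",
    "-f", "DW_LOADS",          # repository folder (hypothetical)
    "-wait",                   # block until the workflow finishes
    "wf_daily_customer_load",  # workflow name (hypothetical)
]

result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
if result.returncode != 0:
    # Non-zero exit signals a failed or aborted workflow run.
    sys.exit(f"Workflow failed: {result.stderr}")
```
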

Divya P Education Details

  • Gayatri Vidya Parishad College Of Engineering (Autonomous)
    Electronics and Communications Engineering

Frequently Asked Questions about Divya P

What company does Divya P work for?

Divya P works for Berkley Medical Management Solutions (a Berkley Company).

What is Divya P's role at the current company?

Divya P's current title is Sr Data Engineer; the profile headline reads "Actively Looking for Data Engineer."

What schools did Divya P attend?

Divya P attended Gayatri Vidya Parishad College Of Engineering (Autonomous).
