I am a data engineer with 8 years of experience developing and managing data pipelines and data products that ingest, process, and analyze large volumes of structured and unstructured data from multiple sources. My experience covers assessing data requirements, moving data into enterprise data lakes, and building data products and reports. I am well-versed in big data technologies and distributed processing frameworks, and I have led the delivery of both real-time and batch ETL pipelines.
-
Big Data Developer
American Express
Nov 2022 - Present
New York, NY, US
• Developed centralized data lakes on AWS using S3, EMR, Redshift, and Athena.
• Built Spark applications, Hive scripts, and real-time streaming pipelines with Kafka.
• Automated infrastructure with AWS CloudFormation, Terraform, and CI/CD tools.
-
Big Data Developer
Great American Insurance Group
May 2021 - Nov 2022
Cincinnati, OH, US
• Managed the development and maintenance of data pipelines in a large-scale distributed environment.
• Delivered both batch and real-time data pipelines using frameworks like Spark.
• Collaborated with cross-functional teams to optimize data processing efficiency.
-
Big Data Developer
Citi Bank
Jan 2020 - Apr 2021
• Integrated data from multiple sources, including FTP servers, Oracle databases, Teradata, and external APIs, into a central data lake using Sqoop, PySpark, Scala, and Kafka.
• Designed Spark applications to prepare, clean, and aggregate data for machine learning and reporting.
• Analyzed user activity and customer profiles to produce analytics and reports for Citi Bank in Texas, United States.
-
Hadoop Developer
Dish Network
Apr 2016 - Dec 2019
Englewood, CO, US
• Developed and implemented Spark transformations and actions using Spark RDDs and DataFrames to analyze customer and sales data.
• Used Kafka to stream data from various sources into HDFS for analysis.
• Collaborated with cross-functional teams to ensure data quality and availability.
• Managed large datasets using pandas DataFrames and MySQL.
-
Hadoop Developer
Hewlett Packard Enterprise
Dec 2015 - Jul 2016
Houston, Texas, US
• Built scalable distributed data solutions using Hadoop, managing cluster maintenance and troubleshooting.
• Analyzed data using the Hadoop components Hive and Pig, transforming large sets of structured and unstructured data.
• Imported and exported data between RDBMS and HDFS using Sqoop, creating Hive tables and writing queries.
-
Java Developer
Cyient
Jan 2013 - Dec 2014
Hyderabad, TS, IN
• Developed complex MapReduce jobs in Java for data extraction, aggregation, and transformation.
• Integrated the Hive warehouse with HBase for information sharing among teams.
• Wrote complex HiveQL queries for analytical functions.
• Developed Pig Latin scripts for data joins and custom processing of MapReduce outputs.
Pavan P Education Details
-
New England College
Computer and Information Systems Security/Information Assurance
Frequently Asked Questions about Pavan P
What company does Pavan P work for?
Pavan P works for American Express.
What is Pavan P's role at the current company?
Pavan P's current role is Big Data Developer @ American Express.
What schools did Pavan P attend?
Pavan P attended New England College.