Karthik P

Senior Big Data Engineer @ UPS
United States
About Karthik P

Hello, my name is Karthik. I am a seasoned Senior Data Engineer with 11+ years of experience in development, design, integration, and presentation using Java, and extensive expertise in the Big Data and Hadoop ecosystems. I am proficient in tools such as Hive, Pig, Flume, Sqoop, Zookeeper, Spark, Kafka, Snowflake, Python, HUDI, CDC, and AWS, and have successfully implemented numerous big data projects on platforms such as Cloudera, Hortonworks, and AWS. I have a proven track record of managing and optimizing Hadoop clusters, implementing GCP DLP policies, and developing ETL pipelines with Spark and Scala. My skills extend to cloud computing, including GCP and AWS, and to NoSQL databases such as HBase, Cassandra, and MongoDB. I am skilled in implementing HUDI for efficient data ingestion and real-time processing, ensuring data consistency and integrity across large-scale data environments, and in designing and optimizing HUDI workflows to manage incremental data updates and upsert operations seamlessly within Hadoop clusters. I excel in collaborative environments, leveraging my technical knowledge to deliver robust and scalable data solutions.

Karthik P's Current Company Details
UPS
Senior Big Data Engineer
United States
Website: ups.com
Employees: 164,089
Karthik P Work Experience Details
  • UPS
    Senior Big Data Engineer
    United States
  • UPS
    Senior Data Engineer
    May 2022 - Present
    Maryland, United States
    • Designed and implemented scalable big data pipelines using AWS services such as S3, Redshift, EMR, and Athena for real-time logistics analytics.
    • Integrated Palantir with AWS Data Lake solutions to enable advanced analytics and real-time insights for decision-making.
    • Designed and developed workflows in Palantir Foundry to optimize data integration and transformation processes for enterprise-wide reporting.
    • Developed real-time data pipelines in Azure Databricks for Workday and PeopleSoft integration, transforming HR and payroll data into analytics-ready formats.
    • Designed and implemented an Enterprise Data Lake for diverse analytics, processing, storage, and reporting needs, handling large, dynamic datasets.
    • Ensured high-quality reference data through cleaning, transformation, and integrity operations in collaboration with stakeholders and solution architects.
    • Ingested CDC data using HUDI, efficiently managing inserts, updates, and deletes, and used tools such as EMR to transform and move large datasets.
    • Automated data cataloging and ETL jobs, enhancing efficiency and reliability.
    • Integrated Talend with Hadoop, Hive, Spark, PySpark, and MySQL for seamless data processing.
    • Leveraged Spark SQL for ETL processes using Scala and Python, and conducted unit, integration, and web application testing with Pytest.
    • Developed reusable ETL frameworks for RDBMS-to-Data-Lake transitions, and migrated Oracle and MS SQL Server databases to PostgreSQL and MySQL.
    • Built a multi-terabyte Data Warehouse infrastructure, monitored performance, set up alerts for system outages, and developed ETL job schedules with the Matillion ETL package.
  • American Express
    Senior Data Engineer
    Jul 2019 - Apr 2022
    Columbus, Ohio Metropolitan Area
    • Designed and built a multi-terabyte Data Warehouse infrastructure on Redshift for large-scale data handling, managing millions of records daily and incorporating data from Workday and PeopleSoft systems.
    • Built comprehensive dashboards in Palantir to track key metrics and KPIs for financial data analysis and reporting.
    • Established and managed Snowflake architecture, including databases, schemas, and warehouses, to support diverse data requirements.
    • Utilized data cataloging tools for efficient data retrieval and executed SQL queries for analysis.
    • Scheduled, tested, and debugged ETL components using DataStage, and wrote reusable mapplets and Oracle PL/SQL stored procedures.
    • Developed and managed ETL jobs to enhance data warehousing capabilities.
    • Managed ETL pipelines and CDC processes to capture and process real-time data changes, processing updates and inserts promptly to keep data fresh.
    • Implemented solutions for automated operational processes and developed SOAP and REST web services.
    • Applied data warehousing concepts in staging tables using advanced ETL tools.
    • Integrated HUDI for efficient processing and incremental data updates, ensuring data consistency and accuracy.
  • Fragma Data Systems
    Data Engineer
    Dec 2017 - Jun 2018
    Hyderabad, Telangana, India
    • Designed and developed Hadoop-based big data analytics solutions and engaged clients in technical discussions.
    • Implemented real-time analytics on streaming data using Azure Stream Analytics, driven by CDC feeds.
    • Worked across Azure platforms including Azure Data Factory, Azure Synapse, Azure Data Lake, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, and HDInsight.
    • Created and implemented custom Hadoop applications in the Azure environment.
    • Created ADF pipelines to load data from on-premises sources into Azure SQL Server databases and Azure Data Lake Storage.
    • Developed complex Hive queries to extract data from various sources (Data Lake) and store it in HDFS.
    • Used Azure Data Lake Analytics and HDInsight/Databricks to generate ad hoc analyses.
    • Developed custom ETL solutions, batch processing, and real-time data ingestion pipelines to move data in and out of Hadoop using PySpark and shell scripting.
    • Ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
    • Worked on all aspects of data mining: data collection, data cleaning, model development, data validation, and data visualization.
    • Managed Azure Data Lake Storage (ADLS) and Databricks Delta Lake, with an understanding of how to integrate them with other Azure services.
  • Kgtiger
    AWS Python Developer
    Mar 2016 - Nov 2017
    Hyderabad, Telangana, India
    • Orchestrated end-to-end deployment of web applications on AWS, optimizing efficiency and leveraging S3 buckets.
    • Implemented AWS CLI Auto Scaling and CloudWatch monitoring, enhancing system performance.
    • Implemented CDC to capture changes in customer behavior and market trends in real time, integrating CDC with Apache Kafka and similar streaming platforms for continuous data ingestion and processing.
    • Used the AWS Glue Data Catalog to organize metadata, making data discoverable and queryable for analytics and reporting, and used AWS Glue to automate the extraction, transformation, and loading of data from various sources into AWS data lakes and warehouses such as Amazon Redshift.
    • Automated continuous integration with Git, Jenkins, and custom Python and Bash tools.
    • Developed server-side modules deployed on AWS Elastic Compute Cloud (EC2), using languages such as Java, PHP, Node.js, and Python.
    • Utilized AWS Lambda for DynamoDB auto scaling and implemented a robust data access layer.
    • Integrated HUDI with AWS services such as S3 for data storage, EMR for processing, and Redshift for data warehousing, ensuring data consistency and accuracy; Apache Hudi enables real-time or near-real-time data integration and processing, essential for applications requiring timely data updates and analytics.
    • Automated nightly builds with Python, reducing pipeline failure efforts by 70%, and employed AWS SNS for automated email notifications and messages after nightly runs.
    • Developed tools for AWS server provisioning, application deployment, and basic failover among regions.
  • EPAM Systems
    Big Data Engineer
    Jun 2013 - Feb 2016
    Hyderabad, Telangana, India
    • Provided recommendations for transitioning to Hadoop with MapReduce, Hive, Sqoop, Flume, and Pig Latin.
    • Developed Spark applications for data validation, cleansing, and custom aggregations, importing data into Spark RDDs for processing.
    • Managed cluster operations such as node commissioning/decommissioning and high availability.
    • Imported and exported data using Flume and analyzed it with Hive and Pig.
    • Set up and benchmarked Hadoop/HBase clusters, including on Amazon EC2.
    • Developed applications across Hadoop technologies and integrated Hive with HBase and Sqoop.
    • Transformed relational databases to HDFS and HBase tables with Sqoop.
    • Integrated Talend and SSIS with Hadoop for ETL operations and installed Hadoop ecosystem components including Hive, Pig, Flume, Sqoop, and Oozie.
    • Utilized Flume for log data collection and aggregation.
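The HUDI-based CDC ingestion described in the UPS role (managing inserts, updates, and deletes by record key) can be sketched in plain Python. This is an illustrative stand-in only: real ingestion would use Spark with the Hudi writer, and the event shape and key names here are assumptions, not the actual pipeline's schema.

```python
# Minimal sketch of HUDI-style CDC upsert/delete semantics in plain
# Python (no Spark/HUDI dependency). The table is held as {key: row};
# upserts replace by key (last write wins), deletes drop the key.

def apply_cdc_batch(table, events, key="id"):
    """Apply a batch of CDC events (insert/update/delete) to a table."""
    for event in events:
        op, row = event["op"], event["row"]
        if op in ("insert", "update"):     # upsert: replace by record key
            table[row[key]] = row
        elif op == "delete":
            table.pop(row[key], None)      # idempotent delete
    return table

# Usage: replay a small CDC feed onto an existing snapshot.
snapshot = {1: {"id": 1, "status": "new"}}
feed = [
    {"op": "update", "row": {"id": 1, "status": "shipped"}},
    {"op": "insert", "row": {"id": 2, "status": "new"}},
    {"op": "delete", "row": {"id": 1}},
]
result = apply_cdc_batch(snapshot, feed)
```

The point of the sketch is the semantics HUDI provides over a data lake: record-level upserts and deletes, which plain append-only file storage cannot express.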
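The ADF pipelines in the Fragma role loaded on-premises data incrementally into Azure. A common pattern behind such pipelines is a high-watermark delta load, sketched here in plain Python; the column name `modified_at` and the in-memory "source" are hypothetical stand-ins for a SQL source queried by the pipeline.

```python
# Illustrative watermark-based incremental extraction: pull only rows
# modified since the last run, then advance the stored watermark.

def extract_incremental(source_rows, last_watermark, ts_col="modified_at"):
    """Return rows changed since last_watermark and the new watermark."""
    delta = [r for r in source_rows if r[ts_col] > last_watermark]
    new_watermark = max((r[ts_col] for r in delta), default=last_watermark)
    return delta, new_watermark

# Usage: only rows 2 and 3 changed after the previous watermark of 20.
rows = [
    {"id": 1, "modified_at": 10},
    {"id": 2, "modified_at": 25},
    {"id": 3, "modified_at": 30},
]
delta, wm = extract_incremental(rows, last_watermark=20)
```

Persisting `wm` between runs is what makes the load incremental: each execution copies only the delta rather than re-extracting the full table.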
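The EPAM role's Spark applications followed a validate → cleanse → aggregate flow over RDDs. The same shape can be shown with Python's built-in map/filter/reduce over a list standing in for an RDD; this is a conceptual sketch, not the original Spark code, and the sample records are invented.

```python
# Sketch of an RDD-style pipeline with stdlib primitives:
# filter = validation, map = cleansing, reduce = custom aggregation.
from functools import reduce

records = ["  10", "x", "5 ", "", "7"]   # raw strings, some invalid

def is_valid(s):
    """Validation step: keep only records that are numeric after trimming."""
    return s.strip().isdigit()

cleansed = map(lambda s: int(s.strip()),        # cleansing: trim + cast
               filter(is_valid, records))       # validation: drop bad rows
total = reduce(lambda a, b: a + b, cleansed, 0) # aggregation: sum
```

In actual PySpark the same flow would be `rdd.filter(is_valid).map(...).reduce(...)`, with the work distributed across the cluster instead of a single list.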

