Pavan K Email and Phone Number

Sr. Data Engineer | 8+ Years in ETL, Cloud Data Solutions, and Big Data Ecosystems | AWS | Azure | GCP | Python | Driving Scalable Data Pipelines & Business Intelligence Solutions @ IBM
New York, New York, United States
Pavan K's Location
Arlington, Texas, United States
About Pavan K

With a Bachelor’s in Engineering from Sathyabama Institute of Science and Technology, I hold over 8 years of experience in data engineering and big data, backed by a strong foundation in ETL, cloud data solutions, and business intelligence. I am passionate about designing scalable, secure data pipelines that drive decision-making in industries like finance, retail, and e-commerce.

At IBM, I lead the development of end-to-end ETL pipelines and cloud-based data solutions, using tools like AWS, Azure, and Python to streamline data processes across teams. My previous roles at Deutsche Bank and Western Union focused on high-performance data migrations, Spark application development, and ETL automation, enabling real-time data integration and analysis.

Looking forward, I aim to continue building data-driven solutions that optimize business performance while leveraging emerging technologies like machine learning and AI. I’m excited to connect with fellow data professionals and explore opportunities that bring impactful solutions to the data landscape.

Pavan K's Current Company Details
IBM

New York, New York, United States
Website: ibm.com
Employees: 512,090
Pavan K Work Experience Details
  • IBM
    Sr. Data Engineer
    IBM Aug 2022 - Present
    - Designed and implemented end-to-end ETL/ELT data pipelines for efficient ingestion, transformation, and integration across supply chain and retail projects.
    - Built scalable data pipelines with Python, PySpark, and Azure Databricks, handling diverse data formats like Avro, JSON, and CSV.
    - Orchestrated data flows with Azure Data Factory and Synapse Analytics, ensuring seamless integration across Azure platforms.
    - Administered Informatica Cloud Services and PowerCenter to manage structured, semi-structured, and unstructured data for comprehensive data solutions.
    - Architected and optimized cloud-based data pipelines on AWS (EC2, S3, Lambda, Redshift) for secure and scalable data processing.
    - Led technical teams in Python and PySpark code reviews, implementing unit tests and ensuring best coding practices.
    - Identified and resolved production issues in ADF and Databricks, enhancing performance and maintaining reliability.
    - Automated ETL processes and integrated Snowflake with AWS through custom Python scripts, boosting data processing efficiency.
    - Engineered data pipelines using AWS Glue, Redshift, Hadoop, Spark, and Kafka for high-performance data operations.
    - Managed data security with AWS KMS, ensuring compliance with GDPR and HIPAA through encryption and access control.
    - Conducted security audits and vulnerability scans to safeguard data in both cloud and on-premise environments.
    - Built data lakes and ETL pipelines using Apache Airflow, Talend, and AWS Glue for structured and unstructured data sources.
    - Optimized SQL queries in Teradata and AWS environments, enhancing data warehouse performance and resolving bottlenecks.
    - Developed Tableau dashboards to analyze e-commerce and business data, supporting strategic decision-making.
    - Applied DevOps practices with Jenkins, Docker, GitHub, and CI/CD pipelines for automated, scalable deployment processes.
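As an illustration of the PySpark/Databricks pipeline work in the entry above, the following is a minimal sketch. Everything in it is assumed for illustration: the storage account, container paths, and column names (order_id, customer_id, amount, event_ts) are hypothetical stand-ins, not details from the actual projects.

```python
# Illustrative sketch only. The storage paths and column names
# (order_id, customer_id, amount, event_ts) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Ingest raw events in two of the formats mentioned above: JSON and CSV.
orders_json = spark.read.json("abfss://raw@example.dfs.core.windows.net/orders/json/")
orders_csv = (
    spark.read.option("header", True).option("inferSchema", True)
    .csv("abfss://raw@example.dfs.core.windows.net/orders/csv/")
)

# Harmonize both sources onto one schema before transformation.
columns = ["order_id", "customer_id", "amount", "event_ts"]
orders = orders_json.select(columns).unionByName(orders_csv.select(columns))

# Basic cleansing plus a derived partition column.
cleaned = (
    orders.dropDuplicates(["order_id"])
    .withColumn("event_date", F.to_date("event_ts"))
)

# Land curated data as partitioned Parquet for downstream consumers.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet(
    "abfss://curated@example.dfs.core.windows.net/orders/"
)
```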
  • Deutsche Bank
    AWS Data Engineer
    Deutsche Bank Jan 2021 - Jul 2022
    New York, United States
    - Gathered business requirements and performed logical and physical database design, including data sourcing, transformation, loading, SQL, and performance tuning.
    - Created ETL packages in SSIS for diverse data loading operations, automating tasks using SQL Server Agent for routine maintenance like backups and indexing.
    - Developed SSRS reports (drill-down, parameterized, matrix, sub-reports, charts) using relational and OLAP databases to support business intelligence needs.
    - Built Spark applications in Databricks using Spark-SQL for data extraction, transformation, and aggregation across multiple file formats.
    - Extracted data from sources like SQL Server, CSV, Excel, and Text files, integrating it into applications for reporting and analytics.
    - Executed data warehouse migration strategies, transitioning from Oracle and Greenplum to AWS Redshift and Oracle, optimizing data pipelines.
    - Managed AWS S3 buckets and policies, utilizing S3 and Glacier for storage, backup, and archival purposes.
    - Developed Spark scripts in Python on AWS EMR for data aggregation, validation, and ad-hoc querying.
    - Conducted data analytics on Databricks’ DataLake platform using PySpark, optimizing performance.
    - Designed Power BI architecture and developed Power BI solutions, migrating reports from SSRS for enhanced data visualization.
    - Built MSBI platform solutions using SSIS and SSRS, handling ETL, reporting, and dashboard creation.
    - Developed SQL, PL/SQL, and Unix shell scripts, creating procedures, functions, and triggers to automate batch jobs with Autosys.
    - Loaded data into Amazon Redshift and monitored AWS RDS instances with CloudWatch for system health and performance.
    - Utilized Teradata utilities (FastLoad, MultiLoad, TPump) for high-performance data loading, with expertise in complex SQL queries and execution plan analysis.
    - Actively participated in Agile Scrum methodology and CI/CD environment, working with Jenkins and delivering under tight release schedules.
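The Spark-SQL extraction and aggregation work in Databricks described above might look like the minimal sketch below. The table name, columns (branch_id, txn_ts, amount), and mount paths are assumptions for illustration, not the actual bank schema.

```python
# Illustrative sketch only: table, columns, and paths are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("txn-aggregation").getOrCreate()

# Register a raw extract (e.g., CSV pulled from SQL Server) as a temp view.
txns = (
    spark.read.option("header", True).option("inferSchema", True)
    .csv("/mnt/raw/transactions.csv")
)
txns.createOrReplaceTempView("transactions")

# Spark-SQL aggregation of the kind described above: daily totals per branch.
daily_totals = spark.sql("""
    SELECT branch_id,
           to_date(txn_ts) AS txn_date,
           COUNT(*)        AS txn_count,
           SUM(amount)     AS total_amount
    FROM transactions
    GROUP BY branch_id, to_date(txn_ts)
""")

# Persist the aggregate for downstream reporting (e.g., Power BI, Redshift).
daily_totals.write.mode("overwrite").parquet("/mnt/curated/daily_totals/")
```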
  • Western Union
    Data Engineer
    Western Union Jan 2019 - Dec 2020
    Denver, Colorado, United States
    - Extracted data from relational databases like SQL Server and MySQL by developing Scala and SQL code, and uploaded it to Hive, integrating new tables with existing databases.
    - Developed code to pre-process large datasets in various formats including Text, Avro, Sequence files, XML, JSON, and Parquet.
    - Configured big data workflows on top of Hadoop, running heterogeneous jobs using Pig, Hive, Sqoop, and MapReduce.
    - Loaded structured and unstructured data from the Linux file system into HDFS, ensuring data accessibility for analysis.
    - Utilized Combiners and Partitioners in MapReduce programming to optimize job execution.
    - Wrote Pig scripts to perform ETL tasks and load data into NoSQL databases for faster analysis.
    - Read data from Flume and pushed batches to HDFS and HBase for real-time processing.
    - Parsed XML data into structured formats and loaded them into HDFS for further processing.
    - Scheduled various ETL processes and Hive scripts by developing Oozie workflows to automate tasks.
    - Utilized Tableau to visualize analyzed data, creating and delivering reports based on insights.
    - Created a POC for Flume implementation, showcasing its effectiveness for data ingestion.
    - Processed metadata files and loaded them into AWS S3 and Elasticsearch clusters for optimized searching.
    - Participated in reviewing both functional and non-functional aspects of the business model to ensure alignment with business goals.
    - Championed the communication and presentation of models to business customers and executives, ensuring clear understanding and alignment with business objectives.
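Parsing XML into structured formats for HDFS, as mentioned in the entry above, could be sketched with the Python standard library. The <transfer> element layout and its field names are assumptions, not the actual Western Union schema.

```python
# Illustrative sketch only: the <transfer> element layout and field names
# are assumed, not the actual schema from this role.
import json
import xml.etree.ElementTree as ET

def xml_to_ndjson(xml_path: str, out_path: str) -> None:
    """Flatten <transfer> elements into newline-delimited JSON,
    a structured format that loads cleanly into HDFS/Hive."""
    tree = ET.parse(xml_path)
    with open(out_path, "w", encoding="utf-8") as out:
        for node in tree.getroot().iter("transfer"):
            record = {
                "transfer_id": node.get("id"),
                "sender": node.findtext("sender"),
                "receiver": node.findtext("receiver"),
                "amount": float(node.findtext("amount", default="0")),
            }
            out.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    xml_to_ndjson("transfers.xml", "transfers.ndjson")
    # The output could then be pushed to HDFS, e.g.:
    #   hdfs dfs -put transfers.ndjson /data/raw/transfers/
```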
  • Helical IT Solutions (dbt, Databricks, Snowflake, Glue, QuickSight, Talend, Power BI, Spark, Gen AI)
    Hadoop Developer
    Helical IT Solutions Jan 2016 - Nov 2018
    Hyderabad, Telangana, India
    - Developed PySpark code and Spark-SQL for faster testing and processing of data.
    - Created Hive tables to load large sets of structured data coming from source systems after transformation of the raw data.
    - Developed custom NiFi processors for parsing data from XML to JSON format and filtering broken files.
    - Migrated an existing on-premises application to AWS, using services like EC2 and S3 for data processing and storage.
    - Created Hive queries to spot trends by comparing fresh data with EDW reference tables and historical metrics.
    - Worked extensively with Apache Airflow on AWS for scheduling jobs after ETL operations.
    - Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources into AWS Redshift.
    - Implemented business logic by writing UDFs in Spark Scala and configuring cron jobs.
    - Performed performance tuning and troubleshooting of the Hadoop cluster.
    - Evaluated the suitability of Hadoop and its ecosystem for projects and implemented proof-of-concept applications for adoption.
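Scheduling ETL jobs with Airflow, as mentioned in the entry above, typically takes the shape of a DAG like the sketch below. The DAG id, task callables, and daily schedule are hypothetical placeholders rather than the actual jobs from this role.

```python
# Illustrative sketch only: DAG id, callables, and schedule are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_and_transform(**context):
    # Placeholder for the Glue/PySpark ETL step described above.
    print("running ETL for", context["ds"])

def load_to_hive(**context):
    # Placeholder for loading the transformed data into Hive tables.
    print("loading Hive partitions for", context["ds"])

with DAG(
    dag_id="campaign_etl",
    start_date=datetime(2018, 1, 1),
    schedule_interval="@daily",  # declarative replacement for raw cron jobs
    catchup=False,
) as dag:
    etl = PythonOperator(
        task_id="extract_and_transform",
        python_callable=extract_and_transform,
    )
    load = PythonOperator(
        task_id="load_to_hive",
        python_callable=load_to_hive,
    )
    # The Hive load runs only after the ETL step succeeds.
    etl >> load
```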

Frequently Asked Questions about Pavan K

What company does Pavan K work for?

Pavan K works for IBM.

What is Pavan K's role at the current company?

Pavan K's current role is Sr. Data Engineer | 8+ Years in ETL, Cloud Data Solutions, and Big Data Ecosystems | AWS | Azure | GCP | Python | Driving Scalable Data Pipelines & Business Intelligence Solutions.

Who are Pavan K's colleagues?

Pavan K's colleagues are Toshiba Devpuria, Sushma Pandey, Rashmi Kottalagi, Nadhiya A, Ármin Kovács, Nirupam Chakraborty, Shiju Krishna.
