Pavan K Email and Phone Number

Sr. Data Engineer | 8+ Years in ETL, Cloud Data Solutions, and Big Data Ecosystems | AWS | Azure | GCP | Python | Driving Scalable Data Pipelines & Business Intelligence Solutions @ IBM
New York, New York, United States
Pavan K's Location
Arlington, Texas, United States
About Pavan K

With a Bachelor’s in Engineering from Sathyabama Institute of Science and Technology, I hold over 8 years of experience in data engineering and big data, backed by a strong foundation in ETL, cloud data solutions, and business intelligence. I am passionate about designing scalable, secure data pipelines that drive decision-making in industries like finance, retail, and e-commerce.

At IBM, I lead the development of end-to-end ETL pipelines and cloud-based data solutions, using tools like AWS, Azure, and Python to streamline data processes across teams. My previous roles at Deutsche Bank and Western Union focused on high-performance data migrations, Spark application development, and ETL automation, enabling real-time data integration and analysis.

Looking forward, I aim to continue building data-driven solutions that optimize business performance while leveraging emerging technologies like machine learning and AI. I’m excited to connect with fellow data professionals and explore opportunities that bring impactful solutions to the data landscape.

Pavan K's Current Company Details
IBM

New York, New York, United States
Website: ibm.com
Employees: 512,090
Pavan K Work Experience Details
  • IBM
    Sr. Data Engineer
    IBM Aug 2022 - Present
    - Designed and implemented end-to-end ETL/ELT data pipelines for efficient ingestion, transformation, and integration across supply chain and retail projects.
    - Built scalable data pipelines with Python, PySpark, and Azure Databricks, handling diverse data formats like Avro, JSON, and CSV.
    - Orchestrated data flows with Azure Data Factory and Synapse Analytics, ensuring seamless integration across Azure platforms.
    - Administered Informatica Cloud Services and PowerCenter to manage structured, semi-structured, and unstructured data for comprehensive data solutions.
    - Architected and optimized cloud-based data pipelines on AWS (EC2, S3, Lambda, Redshift) for secure and scalable data processing.
    - Led technical teams in Python and PySpark code reviews, implementing unit tests and ensuring best coding practices.
    - Identified and resolved production issues in ADF and Databricks, enhancing performance and maintaining reliability.
    - Automated ETL processes and integrated Snowflake with AWS through custom Python scripts, boosting data processing efficiency.
    - Engineered data pipelines using AWS Glue, Redshift, Hadoop, Spark, and Kafka for high-performance data operations.
    - Managed data security with AWS KMS, ensuring compliance with GDPR and HIPAA through encryption and access control.
    - Conducted security audits and vulnerability scans to safeguard data in both cloud and on-premise environments.
    - Built data lakes and ETL pipelines using Apache Airflow, Talend, and AWS Glue for structured and unstructured data sources.
    - Optimized SQL queries in Teradata and AWS environments, enhancing data warehouse performance and resolving bottlenecks.
    - Developed Tableau dashboards to analyze e-commerce and business data, supporting strategic decision-making.
    - Applied DevOps practices with Jenkins, Docker, GitHub, and CI/CD pipelines for automated, scalable deployment processes.
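As an illustration of the PySpark/Databricks pipeline work in the entry above, the following is a minimal sketch. Everything in it is assumed for illustration: the storage account, container paths, and column names (order_id, customer_id, amount, event_ts) are hypothetical stand-ins, not details from the actual projects.

```python
# Illustrative sketch only. The storage paths and column names
# (order_id, customer_id, amount, event_ts) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Ingest raw events in two of the formats mentioned above: JSON and CSV.
orders_json = spark.read.json("abfss://raw@example.dfs.core.windows.net/orders/json/")
orders_csv = (
    spark.read.option("header", True).option("inferSchema", True)
    .csv("abfss://raw@example.dfs.core.windows.net/orders/csv/")
)

# Harmonize both sources onto one schema before transformation.
columns = ["order_id", "customer_id", "amount", "event_ts"]
orders = orders_json.select(columns).unionByName(orders_csv.select(columns))

# Basic cleansing plus a derived partition column.
cleaned = (
    orders.dropDuplicates(["order_id"])
    .withColumn("event_date", F.to_date("event_ts"))
)

# Land curated data as partitioned Parquet for downstream consumers.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet(
    "abfss://curated@example.dfs.core.windows.net/orders/"
)
```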
  • Deutsche Bank
    AWS Data Engineer
    Deutsche Bank Jan 2021 - Jul 2022
    New York, United States
    - Gathered business requirements and performed logical and physical database design, including data sourcing, transformation, loading, SQL, and performance tuning.
    - Created ETL packages in SSIS for diverse data loading operations, automating tasks using SQL Server Agent for routine maintenance like backups and indexing.
    - Developed SSRS reports (drill-down, parameterized, matrix, sub-reports, charts) using relational and OLAP databases to support business intelligence needs.
    - Built Spark applications in Databricks using Spark-SQL for data extraction, transformation, and aggregation across multiple file formats.
    - Extracted data from sources like SQL Server, CSV, Excel, and Text files, integrating it into applications for reporting and analytics.
    - Executed data warehouse migration strategies, transitioning from Oracle and Greenplum to AWS Redshift and Oracle, optimizing data pipelines.
    - Managed AWS S3 buckets and policies, utilizing S3 and Glacier for storage, backup, and archival purposes.
    - Developed Spark scripts in Python on AWS EMR for data aggregation, validation, and ad-hoc querying.
    - Conducted data analytics on Databricks’ DataLake platform using PySpark, optimizing performance.
    - Designed Power BI architecture and developed Power BI solutions, migrating reports from SSRS for enhanced data visualization.
    - Built MSBI platform solutions using SSIS and SSRS, handling ETL, reporting, and dashboard creation.
    - Developed SQL, PL/SQL, and Unix shell scripts, creating procedures, functions, and triggers to automate batch jobs with Autosys.
    - Loaded data into Amazon Redshift and monitored AWS RDS instances with CloudWatch for system health and performance.
    - Utilized Teradata utilities (FastLoad, MultiLoad, TPump) for high-performance data loading, with expertise in complex SQL queries and execution plan analysis.
    - Actively participated in Agile Scrum methodology and CI/CD environment, working with Jenkins and delivering under tight release schedules.
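The Spark-SQL extraction and aggregation work in Databricks described above might look like the minimal sketch below. The table name, columns (branch_id, txn_ts, amount), and mount paths are assumptions for illustration, not the actual bank schema.

```python
# Illustrative sketch only: table, columns, and paths are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("txn-aggregation").getOrCreate()

# Register a raw extract (e.g., CSV pulled from SQL Server) as a temp view.
txns = (
    spark.read.option("header", True).option("inferSchema", True)
    .csv("/mnt/raw/transactions.csv")
)
txns.createOrReplaceTempView("transactions")

# Spark-SQL aggregation of the kind described above: daily totals per branch.
daily_totals = spark.sql("""
    SELECT branch_id,
           to_date(txn_ts) AS txn_date,
           COUNT(*)        AS txn_count,
           SUM(amount)     AS total_amount
    FROM transactions
    GROUP BY branch_id, to_date(txn_ts)
""")

# Persist the aggregate for downstream reporting (e.g., Power BI, Redshift).
daily_totals.write.mode("overwrite").parquet("/mnt/curated/daily_totals/")
```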
  • Western Union
    Data Engineer
    Western Union Jan 2019 - Dec 2020
    Denver, Colorado, United States
    - Extracted data from relational databases like SQL Server and MySQL by developing Scala and SQL code, and uploaded it to Hive, integrating new tables with existing databases.
    - Developed code to pre-process large datasets in various formats including Text, Avro, Sequence files, XML, JSON, and Parquet.
    - Configured big data workflows on top of Hadoop, running heterogeneous jobs using Pig, Hive, Sqoop, and MapReduce.
    - Loaded structured and unstructured data from the Linux file system into HDFS, ensuring data accessibility for analysis.
    - Utilized Combiners and Partitioners in MapReduce programming to optimize job execution.
    - Wrote Pig scripts to perform ETL tasks and load data into NoSQL databases for faster analysis.
    - Read data from Flume and pushed batches to HDFS and HBase for real-time processing.
    - Parsed XML data into structured formats and loaded them into HDFS for further processing.
    - Scheduled various ETL processes and Hive scripts by developing Oozie workflows to automate tasks.
    - Utilized Tableau to visualize analyzed data, creating and delivering reports based on insights.
    - Created a POC for Flume implementation, showcasing its effectiveness for data ingestion.
    - Processed metadata files and loaded them into AWS S3 and Elasticsearch clusters for optimized searching.
    - Participated in reviewing both functional and non-functional aspects of the business model to ensure alignment with business goals.
    - Championed the communication and presentation of models to business customers and executives, ensuring clear understanding and alignment with business objectives.
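Parsing XML into structured formats for HDFS, as mentioned in the entry above, could be sketched with the Python standard library. The <transfer> element layout and its field names are assumptions, not the actual Western Union schema.

```python
# Illustrative sketch only: the <transfer> element layout and field names
# are assumed, not the actual schema from this role.
import json
import xml.etree.ElementTree as ET

def xml_to_ndjson(xml_path: str, out_path: str) -> None:
    """Flatten <transfer> elements into newline-delimited JSON,
    a structured format that loads cleanly into HDFS/Hive."""
    tree = ET.parse(xml_path)
    with open(out_path, "w", encoding="utf-8") as out:
        for node in tree.getroot().iter("transfer"):
            record = {
                "transfer_id": node.get("id"),
                "sender": node.findtext("sender"),
                "receiver": node.findtext("receiver"),
                "amount": float(node.findtext("amount", default="0")),
            }
            out.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    xml_to_ndjson("transfers.xml", "transfers.ndjson")
    # The output could then be pushed to HDFS, e.g.:
    #   hdfs dfs -put transfers.ndjson /data/raw/transfers/
```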
  • Helical IT Solutions (dbt, Databricks, Snowflake, Glue, QuickSight, Talend, Power BI, Spark, Gen AI)
    Hadoop Developer
    Helical IT Solutions Jan 2016 - Nov 2018
    Hyderabad, Telangana, India
    - Developed PySpark code and Spark-SQL for faster testing and processing of data.
    - Created Hive tables to load large sets of structured data coming from source systems after transformation of the raw data.
    - Developed custom NiFi processors for parsing data from XML to JSON format and filtering broken files.
    - Migrated an existing on-premises application to AWS, using services like EC2 and S3 for data processing and storage.
    - Created Hive queries to spot trends by comparing fresh data with EDW reference tables and historical metrics.
    - Worked extensively with Apache Airflow on AWS for scheduling jobs after ETL operations.
    - Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources into AWS Redshift.
    - Implemented business logic by writing UDFs in Spark Scala and configuring cron jobs.
    - Performed performance tuning and troubleshooting of the Hadoop cluster.
    - Evaluated the suitability of Hadoop and its ecosystem for projects and implemented proof-of-concept applications for adoption.
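Scheduling ETL jobs with Airflow, as mentioned in the entry above, typically takes the shape of a DAG like the sketch below. The DAG id, task callables, and daily schedule are hypothetical placeholders rather than the actual jobs from this role.

```python
# Illustrative sketch only: DAG id, callables, and schedule are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_and_transform(**context):
    # Placeholder for the Glue/PySpark ETL step described above.
    print("running ETL for", context["ds"])

def load_to_hive(**context):
    # Placeholder for loading the transformed data into Hive tables.
    print("loading Hive partitions for", context["ds"])

with DAG(
    dag_id="campaign_etl",
    start_date=datetime(2018, 1, 1),
    schedule_interval="@daily",  # declarative replacement for raw cron jobs
    catchup=False,
) as dag:
    etl = PythonOperator(
        task_id="extract_and_transform",
        python_callable=extract_and_transform,
    )
    load = PythonOperator(
        task_id="load_to_hive",
        python_callable=load_to_hive,
    )
    # The Hive load runs only after the ETL step succeeds.
    etl >> load
```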

Frequently Asked Questions about Pavan K

What company does Pavan K work for?

Pavan K works for IBM.

What is Pavan K's role at the current company?

Pavan K's current role is Sr. Data Engineer | 8+ Years in ETL, Cloud Data Solutions, and Big Data Ecosystems | AWS | Azure | GCP | Python | Driving Scalable Data Pipelines & Business Intelligence Solutions.

Who are Pavan K's colleagues?

Pavan K's colleagues are Toshiba Devpuria, Sushma Pandey, Rashmi Kottalagi, Nadhiya A, Ármin Kovács, Nirupam Chakraborty, Shiju Krishna.
