Gautham D

Gautham D Email and Phone Number

Data Engineer @ PNC
Celina, TX, US
Gautham D's Location
Celina, Texas, United States
About Gautham D

I'm a passionate Big Data Architect with over 6 years of experience designing and implementing high-performance data solutions across the entire Big Data landscape. My expertise spans industry-leading technologies like Apache Hadoop, Spark (encompassing Spark Core, SQL, and Streaming), Kafka, and the full suite of AWS Cloud services.

I excel at building robust data pipelines that seamlessly handle structured, semi-structured, and unstructured data. Whether it's crafting complex ETL (Extract, Transform, Load) processes or mastering data ingestion techniques, I ensure efficient data movement and transformation to unlock valuable insights.

Spark and Scala are my tools of choice for real-time data processing. I leverage these technologies to build applications that perform event enrichment, data aggregation, and de-normalization, empowering businesses to make data-driven decisions in near real time.

Cloud migration is another area of expertise. I have a deep understanding of AWS Cloud and a proven track record of architecting and migrating data pipelines to the platform, from using services like EC2, S3, and EMR for infrastructure and storage to employing CloudFormation for infrastructure automation.

My skill set also extends beyond the Big Data ecosystem: I'm proficient in implementing OLAP multi-dimensional cubes using Azure Data Warehouse and Databricks, enabling advanced data analysis, and I champion data security by managing access and user privileges through IAM in AWS and through schema-level privileges in Snowflake.

I thrive in fast-paced environments and consistently deliver projects on tight deadlines. My test-driven development (TDD), behavior-driven development (BDD), and acceptance test-driven development (ATDD) approach ensures code quality and project success.

In essence, I'm a Big Data professional driven by a desire to empower businesses. By leveraging cutting-edge technologies and fostering a collaborative spirit, I build secure and scalable data pipelines that unlock the full potential of your organization's data.
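
As a concrete illustration of the real-time processing described above, here is a minimal PySpark Structured Streaming sketch that consumes JSON events from Kafka, parses them against a schema, and aggregates them in tumbling windows. This is a sketch under assumptions, not code from the profile: the broker address, topic name, and event fields are all placeholders.

```python
# Minimal sketch of a Spark Structured Streaming job: Kafka -> parse JSON -> windowed aggregation.
# Requires the spark-sql-kafka package on the classpath; all names below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import (DoubleType, StringType, StructField,
                               StructType, TimestampType)

spark = SparkSession.builder.appName("event-aggregation").getOrCreate()

# Hypothetical schema for the JSON messages on the topic.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Tumbling five-minute windows per user, tolerating ten minutes of late data.
agg = (
    events
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("user_id"))
    .agg({"amount": "sum"})
)

query = agg.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```

In practice a job like this would write to a durable sink (Hive, Delta, or a database) rather than the console, with a checkpoint location for fault tolerance.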

Gautham D's Current Company Details
PNC

Data Engineer
Celina, TX, US
Website:
pnc.com
Employees:
62,437
Gautham D's Work Experience Details
  • PNC
    Data Engineer
    PNC
    Celina, TX, US
  • MassMutual
    Data Engineer
    MassMutual Nov 2023 - Present
    Springfield, Massachusetts Metropolitan Area
    I'm a Big Data Engineer focused on designing and implementing scalable data pipelines and analytics solutions, leveraging technologies like Apache Spark, Scala, Apache Hive, and the Azure ecosystem to help businesses unlock the power of their data. I was responsible for gathering, analyzing, and designing system requirements. I developed Spark programs in Scala to compare the performance of Spark against Hive and SparkSQL, and created a Spark streaming application that consumed JSON messages from Kafka and performed the necessary transformations (much like the streaming sketch shown above). I leveraged the Spark API over Hortonworks Hadoop YARN to conduct data analytics on Hive data, implemented Spark using Scala and SparkSQL for faster testing and data processing, and was involved in developing a MapReduce framework that filtered out bad and unnecessary records. I designed and implemented data pipelines in Azure using Data Factory and Databricks, handling data extraction, transformation, and loading, and used PySpark and Spark SQL for complex transformations and optimized data storage. I migrated data from on-premises SQL Server to Azure, ensuring secure management with Azure Key Vault. I created Grafana dashboards for data analysis and used Azure App Insights for troubleshooting. I also developed Python-based RESTful APIs, integrated AJAX-driven functionality, and used Git and GitHub for version control and collaboration.
  • AT&T
    Big Data Engineer
    AT&T Jan 2023 - Oct 2023
    Dallas, Texas, United States
    I developed Flume agents to process web server logs and load them into MongoDB, and have expertise in MongoDB data modeling, tuning, and disaster recovery. I built data pipelines using Azure Data Factory to load data from on-premises sources to Azure, working with copy activities, implementing error handling, and utilizing various Data Factory activities. I configured Logic Apps for email notifications, created dynamic pipelines for multiple sources and targets, and migrated data from on-premises SQL Server to Azure databases. I designed and implemented data pipelines using Azure Data Factory and Databricks, using PySpark and Spark SQL for complex transformations and optimizing data storage with Parquet formats. I implemented IoT streaming with Databricks Delta tables and Delta Lake (a hedged Delta ingestion sketch appears after this list), and integrated Azure Logic Apps, Functions, Storage, and Service Bus Queues for ERP systems. I analyzed business requirements and designed one-time load strategies for migrating large databases to Azure SQL DWH, using Azure Data Factory and HDInsight to extract, transform, and load data. I created Spark clusters in Azure Databricks for data preparation; used stored procedures, lookups, pipelines, data flows, copy data, and Azure Functions in ADF; and estimated cluster size, monitored, and troubleshot Spark clusters.
  • AIG
    Data Engineer
    AIG Feb 2022 - Dec 2022
    Irving, Texas, United States
    I developed Spark scripts using Scala and Java for data processing and analysis, and used the Spark API over Hadoop for Hive analytics. I created Scala scripts and UDFs and optimized algorithms using Spark Context, Spark-SQL, DataFrames, and Pair RDDs. I have experience handling large datasets efficiently using partitions, in-memory capabilities, broadcasts, joins, transformations, and other techniques, and I developed a Spark streaming pipeline to parse JSON data and store it in Hive tables. I worked with Sqoop to import metadata from Oracle, created Hive tables, and analyzed data using Hive queries, implementing schema extraction for Parquet and Avro file formats in Hive (an Avro-to-Parquet sketch appears after this list). I used Talend Open Studio for ETL jobs, implemented partitioning, and collaborated with other teams on data quality. I migrated applications to ADLS and used CloudWatch logs. I wrote HiveQL queries, processed data in Spark, and stored results in Hive tables; imported data from Oracle and Postgres using Sqoop; and migrated HiveQL to Impala for better performance. I implemented Avro and Parquet data formats for Hive computations, performed data validation, and created Sqoop jobs and Hive scripts for data ingestion and comparison.
  • Othmap
    Software Engineer
    Othmap Feb 2018 - Dec 2021
    Hyderabad, Telangana, India
    Migrated data from on-premises sources to AWS S3 buckets using custom Python scripts, and developed Python scripts to interact with REST APIs and extract data directly to S3. Built data ingestion pipelines utilizing AWS Lambda, Glue, and Step Functions for data cleansing and transformation, and created YAML files to define data sources, Glue tables, and stack creation for each data pipeline. Extracted data from Netezza databases and transferred it to S3 using a custom Python script. Developed and deployed Lambda functions with IAM roles to execute Python scripts triggered by various events (SQS, EventBridge, SNS), and created a Lambda function to automatically handle data upload events from S3 buckets (a sketch of such a handler appears after this list). Designed and developed Informatica mappings to integrate data from various sources, targets, and transformations, utilizing Informatica transformations (Expression, Filter, Joiner, Lookup) for data cleansing, consistency, and efficient data migration. Designed and implemented Sqoop for incremental data loading from DB2 to Hive tables, connected Hive to Tableau via HiveServer2 to generate interactive reports, and used Sqoop to transfer data between HDFS, RDBMS, and other sources. Developed Spark applications using PySpark and Spark-SQL for complex data extraction, transformation, and aggregation from diverse file formats. Built Spark Streaming applications to ingest real-time data from Kafka and store it in HDFS and NoSQL databases (HBase, Cassandra), and collected and processed near-real-time data from S3 buckets using Spark Streaming, performing transformations and aggregations to build data models stored in HDFS. Used Apache NiFi for data transfer from local file systems to HDP, performed data modeling (star/snowflake schemas, OLTP/OLAP systems) using Erwin for conceptual, logical, and physical modeling, and used Oozie to automate data loading into HDFS.
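
The AT&T entry above mentions IoT streaming into Databricks Delta tables. Below is a hedged sketch of that pattern: a streaming read from Kafka appended to a Delta table through a checkpoint. It assumes a cluster with Delta Lake available (for example, Databricks); the broker, topic, schema, and paths are all illustrative, not details from the profile.

```python
# Sketch: ingest IoT readings from Kafka and append them to a Delta table.
# Assumes Delta Lake and the Kafka connector are on the cluster; names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import (DoubleType, StringType, StructField,
                               StructType, TimestampType)

spark = SparkSession.builder.appName("iot-delta-ingest").getOrCreate()

# Hypothetical sensor reading schema.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("temperature", DoubleType()),
    StructField("reading_time", TimestampType()),
])

readings = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "iot-readings")               # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("r"))
    .select("r.*")
)

# Append to a Delta table; the checkpoint directory is what gives the
# stream exactly-once append semantics across restarts.
query = (
    readings.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/iot")  # placeholder path
    .outputMode("append")
    .start("/mnt/delta/iot_readings")                      # placeholder table path
)
```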
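
The AIG entry mentions Avro and Parquet formats for Hive computations. Here is a minimal batch-style sketch of that idea, assuming a hypothetical orders dataset, Hive metastore access, and the spark-avro package; the paths, column names, and table name are invented for illustration.

```python
# Sketch: read Avro files and persist them as a Parquet-backed, partitioned Hive table.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("avro-to-parquet-hive")
    .enableHiveSupport()   # lets saveAsTable register the table in the metastore
    .getOrCreate()
)

# Reading Avro requires the spark-avro package on the classpath.
raw = spark.read.format("avro").load("/data/landing/orders/")  # placeholder path

# Light validation before persisting (hypothetical columns).
clean = raw.dropDuplicates(["order_id"]).filter("order_id IS NOT NULL")

# Store as a partitioned Parquet table for downstream HiveQL or Impala queries.
(clean.write
    .mode("overwrite")
    .format("parquet")
    .partitionBy("order_date")
    .saveAsTable("analytics.orders"))  # placeholder database.table
```

Partitioning by a date column keeps Hive and Impala scans pruned to the partitions a query actually touches.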
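
Finally, the Othmap entry describes a Lambda function that handles data upload events from S3. A minimal handler sketch follows, assuming standard s3:ObjectCreated:* notification wiring; the event shape follows the documented S3 notification format, and the logging action stands in for whatever processing the real function performed.

```python
# Sketch of a Lambda handler reacting to S3 upload events.
# The bucket and key come from the notification event; the action here is illustrative.
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    """Triggered by s3:ObjectCreated:* notifications on the source bucket."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 keys arrive URL-encoded in notification events.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Example action: fetch the uploaded object and log its size.
        obj = s3.get_object(Bucket=bucket, Key=key)
        print(json.dumps({"bucket": bucket, "key": key,
                          "bytes": obj["ContentLength"]}))
    return {"status": "ok"}
```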

Frequently Asked Questions about Gautham D

What company does Gautham D work for?

Gautham D works for PNC.

What is Gautham D's role at the current company?

Gautham D's current role is Data Engineer.

Who are Gautham D's colleagues?

Gautham D's colleagues are Jordan Bonar, Sean McDonald, Frank Fuerte, Aubrey Paul, Steve Basic, Kimberly Dombroski, and Diane Spyra.
