Suprith Son Dubba Email and Phone Number
Senior Data Engineer specializing in cloud migrations (AWS, Azure, GCP) and building scalable data solutions. I design and implement data pipelines, ETL workflows, and data warehousing solutions using tools such as Hadoop, Spark, Kafka, and Hive. My work spans cloud platforms (AWS, Azure, GCP), where I have migrated on-premises systems, optimized data processing, and delivered business insights through Power BI, Tableau, and SQL. I am skilled in data modeling, analytics, and cloud integration, and I use Agile and Scrum methodologies to lead complex projects. Passionate about leveraging data to drive efficiency and business value, I focus on delivering high-performance, scalable solutions for enterprise needs.
Truist
- Website: truist.com
- Employees: 8,972
Senior Data Engineer, Truist
Jan 2022 - Present | Charlotte, North Carolina, United States
- Migrated an existing on-premises application to AWS, using services such as EC2 and S3 for processing and storing small data sets; maintained the Hadoop cluster on AWS EMR.
- Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them.
- Developed Spark applications in Scala and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
- Improved performance and optimized existing Hadoop algorithms using Spark Context, Spark SQL, Spark MLlib, DataFrames, pair RDDs, and Spark on YARN.
- Used Spark Streaming APIs to perform transformations and actions on the fly, building a common learner data model that consumes data from Kafka in near real time and persists it to Cassandra.
- Created, altered, and deleted Kafka topics as required; tuned performance using partitioning and bucketing of Impala tables.
- Moved files between HDFS and AWS S3 and worked extensively with S3 buckets; created data partitions on large data sets in S3 and defined DDL on the partitioned data.
- Worked with Elastic MapReduce and set up Hadoop environments on AWS EC2 instances; converted all Hadoop jobs to run on EMR, configuring the cluster according to data size.
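As a hedged illustration of the S3 partitioning and DDL work described above, the sketch below builds Hive-style partition prefixes and a matching ADD PARTITION statement in plain Python. The bucket name, table name, and `dt` column are hypothetical placeholders, not details from the role itself.

```python
# Sketch: generate Hive-style partition prefixes for objects in S3 and the
# DDL that registers each partition with an external table.
# Bucket, table, and column names are illustrative placeholders.

def partition_prefix(bucket: str, table: str, dt: str) -> str:
    """Return the Hive-style S3 prefix for one daily partition."""
    return f"s3://{bucket}/{table}/dt={dt}/"

def add_partition_ddl(table: str, bucket: str, dt: str) -> str:
    """Return the ALTER TABLE statement that registers the partition."""
    location = partition_prefix(bucket, table, dt)
    return (f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION (dt='{dt}') LOCATION '{location}'")

if __name__ == "__main__":
    for day in ("2022-01-01", "2022-01-02"):
        print(add_partition_ddl("example-bucket", "events", day))
```

Laying partitions out this way lets the query engine prune by `dt` instead of scanning the whole bucket.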
Senior Data Engineer, Cardinal Health
Jul 2019 - Dec 2021 | Dublin, OH, United States
- Developed Spark scripts in Python on Azure HDInsight for aggregation, validation, and performance verification of vehicle telemetry data, improving on traditional MapReduce jobs, and built pipelines to move hashed and un-hashed EV performance data from Azure Blob Storage to Azure Data Lake.
- Used Azure HDInsight to monitor and manage Hadoop clusters, performed advanced procedures such as text analytics and in-memory computing on EV sensor data using Spark with Python, and created pipelines to transfer on-premises vehicle diagnostic data to Azure Data Lake.
- Ingested EV telematics and battery performance data into Azure services including Azure Data Lake, Azure Storage, Azure SQL, and Azure Data Warehouse; processed data in Azure Databricks and built complex ETL jobs to transform vehicle diagnostics and energy usage data visually using Azure Data Factory, Databricks, and Azure SQL Database.
- Analyzed SQL scripts and designed PySpark implementations; enhanced and optimized Spark scripts for aggregating vehicle efficiency data, grouping fleet performance metrics, and mining insights from EV telemetry; loaded vehicle performance data into Spark RDDs for in-memory evaluation of real-time driving patterns.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs and PySpark to analyze EV battery lifecycle metrics; optimized Hadoop algorithms using Spark Context, Spark SQL, DataFrames, and pair RDDs for vehicle energy usage; performed analytics on EV data with the Spark API over Hadoop YARN.
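To illustrate the Hive-to-Spark conversion pattern mentioned above, here is a minimal sketch of how a `GROUP BY` aggregation maps onto map/reduceByKey-style transformations. Plain Python stands in for a Spark cluster, and the telemetry records and field names are invented for the example.

```python
# Sketch: the RDD-style equivalent of a Hive query such as
#   SELECT vehicle_id, AVG(battery_pct) FROM telemetry GROUP BY vehicle_id
# expressed as map -> reduceByKey -> map steps. Plain Python stands in for
# Spark here; records and field names are illustrative.

def reduce_by_key(pairs, fn):
    """Plain-Python analogue of RDD.reduceByKey."""
    acc = {}
    for k, v in pairs:
        acc[k] = fn(acc[k], v) if k in acc else v
    return list(acc.items())

telemetry = [
    ("ev-1", 80.0), ("ev-1", 60.0),
    ("ev-2", 90.0), ("ev-2", 70.0), ("ev-2", 50.0),
]

# map: (vehicle_id, pct) -> (vehicle_id, (sum, count))
mapped = [(vid, (pct, 1)) for vid, pct in telemetry]
# reduceByKey: combine partial (sum, count) pairs
reduced = reduce_by_key(mapped, lambda a, b: (a[0] + b[0], a[1] + b[1]))
# final map: (sum, count) -> average
averages = {vid: s / c for vid, (s, c) in reduced}
print(averages)  # {'ev-1': 70.0, 'ev-2': 70.0}
```

Carrying `(sum, count)` pairs through the reduce step is what makes the average associative and therefore safe to compute across partitions.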
Data Engineer, T-Mobile
Nov 2017 - Jun 2019 | Overland Park, Kansas, United States
- Migrated an existing on-premises application to AWS, using services such as EC2 and S3 for processing and storing small data sets; maintained the Hadoop cluster on AWS EMR.
- Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them.
- Developed Spark applications in Scala and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
- Improved performance and optimized existing Hadoop algorithms using Spark Context, Spark SQL, Spark MLlib, DataFrames, pair RDDs, and Spark on YARN.
- Used Spark Streaming APIs to perform transformations and actions on the fly, building a common learner data model that consumes data from Kafka in near real time and persists it to Cassandra.
- Created, altered, and deleted Kafka topics as required; tuned performance using partitioning and bucketing of Impala tables.
- Moved files between HDFS and AWS S3 and worked extensively with S3 buckets; created data partitions on large data sets in S3 and defined DDL on the partitioned data.
- Worked with Elastic MapReduce and set up Hadoop environments on AWS EC2 instances; converted all Hadoop jobs to run on EMR, configuring the cluster according to data size.
- Monitored and troubleshot Hadoop jobs using the YARN Resource Manager, and EMR job logs using Genie and Kibana.
- Created a YAML manifest to push the application to Pivotal Cloud Foundry; deployed the Spark application and Java web services on Pivotal Cloud Foundry.
- Implemented rapid provisioning and lifecycle management using Amazon EC2, Chef, and custom Ruby/Bash scripts.
- Created RDDs in Spark and extracted data from the data warehouse onto them.
- Configured property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml according to job requirements and the multi-node cluster environment.
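The entry above mentions hand-editing Hadoop property files such as core-site.xml for a multi-node cluster. As a hedged sketch of that configuration shape, the snippet below emits a minimal `<configuration>`/`<property>` block; the `fs.defaultFS` value is a placeholder NameNode address, not one from the role.

```python
# Sketch: emit a minimal core-site.xml property block of the kind referenced
# above. The fs.defaultFS value is a placeholder NameNode address.
import xml.etree.ElementTree as ET

def build_core_site(props: dict) -> str:
    """Render a dict of Hadoop properties as core-site.xml markup."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = build_core_site({"fs.defaultFS": "hdfs://namenode.example:8020"})
print(xml_text)
```

Generating the XML from one dict per environment is a common way to keep multi-node cluster configs consistent instead of editing each file by hand.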
Data Engineer, Sopan Technologies
Jul 2015 - Aug 2017 | Hyderabad, Telangana, India
As a Big Data Engineer at Sopan Technologies, I worked on large-scale data processing and analytics projects built on Hadoop and Spark, focusing on processing raw data at scale and building efficient ETL pipelines for real-time and batch workloads.
- Processed raw data at scale on the Hadoop big data platform, loading disparate datasets from various environments into HDFS.
- Developed ETL data flows using Hadoop ecosystem components such as Spark, Spark Streaming, and Spark SQL with Scala, ensuring efficient data processing.
- Led the development of large-scale, high-speed, low-latency data solutions for real-time reporting, data warehousing, and long-term data storage.
- Improved performance and optimized existing algorithms using Spark Context, Spark SQL, and DataFrame APIs.
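The ETL flows described above reduce to an extract/transform/load chain. The sketch below shows that shape in plain Python under stated assumptions: the rows, field names, and rejection rule are invented, and a real pipeline would read from HDFS and write via Spark SQL rather than in-memory lists.

```python
# Sketch: a batch ETL flow reduced to three pure-Python stages
# (extract -> transform -> load). Rows and field names are illustrative.

def extract():
    """Stand-in for loading raw rows from HDFS."""
    return [
        {"id": 1, "amount": "10.5"},
        {"id": 2, "amount": "bad"},   # malformed row, will be rejected
        {"id": 3, "amount": "7.25"},
    ]

def transform(rows):
    """Validate and cast; drop rows whose amount fails to parse."""
    clean = []
    for row in rows:
        try:
            clean.append({"id": row["id"], "amount": float(row["amount"])})
        except ValueError:
            continue
    return clean

def load(rows):
    """Stand-in for persisting to a warehouse table; returns the row count."""
    return len(rows)

loaded = load(transform(extract()))
print(loaded)  # 2 valid rows survive
```

Keeping each stage a pure function makes the validation step easy to unit-test before pointing the pipeline at real data.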
Data Engineer, Keypoint Technologies
May 2013 - Jun 2015 | Gachibowli, Hyderabad
As a Hadoop Engineer at KeyPoint Technologies, I developed scalable data solutions using the Hadoop ecosystem and Snowflake for large-scale data processing and storage.
- Worked in a Snowflake shared technology environment providing stable infrastructure, secure environments, reusable frameworks, and automated utilities such as secured database connections, code review, build, and deployment processes (SCBD).
- Migrated data from an Amazon Redshift data warehouse to Snowflake, using ETL processes to transfer data from sources to targets.
- Built a dimensional data vault architecture on Snowflake for scalable, optimized data storage and retrieval.
- Built a scalable Hadoop cluster running Hortonworks Data Platform (HDP 2.6) to handle large-scale data processing.
- Developed Spark code using Scala and Spark SQL for faster data processing, optimizing performance with Spark Context, pair RDDs, and Spark SQL.
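A Redshift-to-Snowflake migration of the kind listed above typically lands unloaded files in a stage and loads them with COPY INTO. The sketch below just generates that statement as a string; the table, stage, and file-format names are hypothetical placeholders, not details taken from the role.

```python
# Sketch: generate the Snowflake COPY INTO statement used to load staged
# files during a Redshift-to-Snowflake migration. Table, stage, and
# file-format names are illustrative placeholders.

def copy_into(table: str, stage: str, file_format: str) -> str:
    """Build a COPY INTO statement loading a table from a named stage."""
    return (f"COPY INTO {table} FROM @{stage} "
            f"FILE_FORMAT = (FORMAT_NAME = '{file_format}')")

stmt = copy_into("orders", "redshift_unload_stage", "csv_gzip")
print(stmt)
```

Generating the statement per table makes it straightforward to drive the whole migration from a list of table names rather than hand-written scripts.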
Suprith Son Dubba Education Details
- Mvsr Engineering College
Frequently Asked Questions about Suprith Son Dubba
What company does Suprith Son Dubba work for?
Suprith Son Dubba works for Truist
What is Suprith Son Dubba's role at the current company?
Suprith Son Dubba's current role is Senior Data Engineer | Python | Cloud Specialist (AWS, Azure, GCP) | ETL & Data Pipeline Development | Data Warehousing & Analytics Enthusiast | Power BI & Tableau Reporting.
What schools did Suprith Son Dubba attend?
Suprith Son Dubba attended Mvsr Engineering College.
Who are Suprith Son Dubba's colleagues?
Suprith Son Dubba's colleagues are Kerrie Mcgarrigle, Kristin Lineberry, Robert Stephens, Scott Edmondson, Jhon Carvajal, Christina Zutty, Jacob Herrin.