Sai Kumar


Senior Data Engineer @ TD
Hamilton, ON, CA
Sai Kumar's Location
Hamilton, Ontario, Canada
About Sai Kumar

As a Senior Data Engineer at TD, I am responsible for executing big data analytics, predictive analytics, and machine learning initiatives using AWS and Hadoop technologies. I have developed and deployed scalable, reliable data pipelines using Scala, Spark, Kafka, and Redshift, and integrated them with web applications hosted on AWS cloud services. I have also used Spark Streaming and machine learning to process and analyze real-time data from various sources and generate insights for business decision-making.

I have experience with Azure Cloud, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure analytical services, Azure Cosmos DB (NoSQL), Azure big data technologies (Hadoop and Apache Spark), and Databricks; experience developing Spark pipelines in Scala and PySpark; and experience building ETL data pipelines in Azure Databricks using PySpark and Spark SQL.

With over 5.5 years of professional experience, I have gained expertise in Hadoop ecosystem components such as HDFS, Hive, Sqoop, HBase, ZooKeeper, and Oozie, and have used them to ingest, transform, and store data from various sources. I have also used Python libraries such as PySpark, Pytest, PyMongo, and Pandas extensively for data manipulation, testing, and visualization. I am an AWS Certified Developer - Associate, with experience using AWS services such as EC2, S3, Glue, Lambda, and DynamoDB to build and manage cloud-based data solutions. I am passionate about learning new technologies and solving complex data problems.
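The Python-based data manipulation mentioned above can be sketched at small scale with the standard library alone; the clickstream-style event data here is invented for illustration, standing in for the kind of aggregation the profile describes doing with Pandas and PySpark at much larger scale:

```python
from collections import Counter

# Hypothetical clickstream sample: (user_id, event_type) pairs.
events = [(1, "view"), (1, "click"), (2, "view"), (2, "view"), (2, "click")]

# Count events of each type per user -- a toy version of the aggregations
# the profile describes building with Pandas/PySpark.
per_user = Counter(events)
views_per_user = {u: n for (u, e), n in per_user.items() if e == "view"}
print(views_per_user)  # → {1: 1, 2: 2}
```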

Sai Kumar's Current Company Details
TD
Senior Data Engineer
Hamilton, ON, CA
Sai Kumar Work Experience Details
  • TD
    Senior Data Engineer
    TD, Mar 2023 - Present
    ● Responsible for the execution of big data analytics, predictive analytics, and machine learning initiatives.
    ● Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs in Spark for data aggregation and queries, writing results back into an S3 bucket.
    ● Wrote, compiled, and executed programs as necessary using Apache Spark in Scala to perform ETL jobs on ingested data; worked with big data on AWS cloud services, i.e., EC2, S3, EMR, and DynamoDB.
    ● Implemented an end-to-end solution for hosting a web application on AWS with integration to S3 buckets.
    ● Created complex ETL Azure Data Factory pipelines using mapping data flows with multiple input/output transformations.
    ● Worked on Azure Blob and Data Lake Storage and loaded data into Azure SQL.
    ● Worked with the Azure SQL Database Import and Export Service.
    ● Used Azure Key Vault as the central repository for secrets, referencing them in Azure Data Factory and in Databricks notebooks.
    ● Built a common SFTP download/upload framework using Azure Data Factory and Databricks.
    ● Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
    ● Involved in designing and deploying multi-tier applications using AWS services (EC2, Route 53, S3, RDS, DynamoDB, SNS, SQS, IAM), focusing on high availability, fault tolerance, and auto-scaling via AWS CloudFormation.
    ● Wrote Spark applications for data validation, cleansing, transformation, and custom aggregation, and used the Spark engine and Spark SQL for data analysis, providing results to data scientists for further analysis.
    ● Developed REST APIs in Python using the Flask and Django frameworks.
    ● Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
  • Cybage Software
    AWS Data Engineer
    Cybage Software, May 2021 - Dec 2022
    India
    ● Involved in all phases of the SDLC, including requirement gathering, design, analysis, testing of customer specifications, development and deployment of the application, and designing a reliable and scalable data pipeline.
    ● Ingested user data daily from external servers such as FTP servers and S3 buckets using custom input adapters.
    ● Designed and developed complex data pipelines using Sqoop, Spark, and Hive to ingest, transform, and analyze customer behavior data.
    ● Implemented Spark jobs in Scala, utilizing the DataFrame and Spark SQL APIs for faster data processing.
    ● Created Sqoop scripts to import/export user profile data between RDBMS and the S3 data lake.
    ● Developed various Spark applications in Scala to perform enrichments of user behavioral data (clickstream data) merged with user profile data.
    ● Involved in data cleansing, event enrichment, data aggregation, de-normalization, and data preparation needed for downstream model training and reporting.
    ● Developed and added features to existing data analytics applications built with Spark and Hadoop on a Scala/Python development platform on top of AWS services.
    ● Programmed in Python and Scala with the Hadoop framework, using Cloudera Hadoop ecosystem projects (HDFS, Spark, Sqoop, Hive, HBase, Oozie, Impala, ZooKeeper, etc.).
    ● Utilized the Spark Scala API to implement batch processing of jobs.
    ● Troubleshot Spark applications for improved error tolerance.
    ● Fine-tuned Spark applications/jobs to improve efficiency and overall pipeline processing time.
    ● Created a Kafka producer API to send live-stream data into various Kafka topics.
    ● Developed Spark Streaming applications to consume data from Kafka topics and insert the processed streams into Redshift.
    ● Utilized Spark's in-memory capabilities to handle large datasets.
    ● Used the Scala collections framework to store and process complex consumer information.
  • Accenture
    Cloud Data Engineer
    Accenture, Jan 2019 - Apr 2021
    Hyderabad, Telangana, India
    ● Responsible for the execution of big data analytics, predictive analytics, and machine learning initiatives.
    ● Implemented a proof of concept deploying the product to an AWS S3 bucket and Snowflake.
    ● Utilized AWS services with a focus on big data architecture, analytics, enterprise data warehousing, and business intelligence solutions to ensure optimal architecture, scalability, flexibility, availability, and performance, providing meaningful and valuable information for better decision-making.
    ● Used Informatica across different data sources to collect, manage, and distribute data with accuracy and consistency.
    ● Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs in Spark for data aggregation and queries, writing results back into the S3 bucket; experienced in data cleansing and data mining.
    ● Wrote, compiled, and executed programs as necessary using Apache Spark in Scala to perform ETL jobs on ingested data.
    ● Used Spark Streaming to divide streaming data into batches as input to Spark for batch processing.
    ● Wrote Spark applications for data validation, cleansing, transformation, and custom aggregation, and used the Spark engine and Spark SQL for data analysis, providing results to data scientists for further analysis.
    ● Prepared scripts in Python and Scala to automate the ingestion process as needed from various sources such as APIs, AWS S3, Teradata, and Snowflake.
    ● Designed and developed Spark workflows in Scala to pull data from an AWS S3 bucket and Snowflake and apply transformations to it.
    ● Coordinated with offshore and onshore teams to ensure project timelines and deliverables were met.
    ● Experienced in testing, debugging, and troubleshooting Glue and PySpark scripts.
    ● Experienced in optimizing and scaling PySpark scripts for better performance.
    ● Implemented Spark RDD transformations to map business analysis requirements and applied actions on top of those transformations.
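Several of the roles above use Spark Streaming to divide a live stream into batches that are handed to the batch engine. A minimal pure-Python sketch of that micro-batching idea (no Spark dependency; the batch size and event values are illustrative):

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Yield successive fixed-size batches from an iterator, mimicking how
    Spark Streaming groups incoming records into micro-batches."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Each emitted batch would be handed to the batch engine (Spark, in the profile).
batches = list(micro_batches(range(10), batch_size=4))
print(batches)  # → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```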
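The Cybage role merges clickstream events with user-profile data imported via Sqoop. A toy sketch of that enrichment join in plain Python (the field names and values are hypothetical; in the profile's stack this would be a Spark DataFrame join at scale):

```python
# Hypothetical user profiles keyed by user_id (imported from an RDBMS).
profiles = {1: {"segment": "premium"}, 2: {"segment": "basic"}}

# Raw click events; user 3 has no profile and illustrates a left-join miss.
clicks = [
    {"user_id": 1, "page": "/home"},
    {"user_id": 2, "page": "/offers"},
    {"user_id": 3, "page": "/home"},
]

# Enrich each event with profile attributes (left join; unknown users get None).
enriched = [
    {**c, "segment": profiles.get(c["user_id"], {}).get("segment")}
    for c in clicks
]
print(enriched[0])  # → {'user_id': 1, 'page': '/home', 'segment': 'premium'}
```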
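The Accenture bullets mention Spark RDD transformations with actions applied on top of them. Spark's lazy transformation-then-action model can be sketched with Python generators standing in for RDDs; the transaction amounts and the 1.25 multiplier are invented for illustration:

```python
from functools import reduce

# Stand-in "RDD": transformations (filter/map) build lazy generator pipelines;
# nothing executes until an action (reduce, here) pulls the data through --
# the same lazy model Spark uses for RDD transformations and actions.
amounts = [120.0, -5.0, 80.0, 300.0]       # hypothetical transaction amounts
valid = (a for a in amounts if a > 0)      # transformation: filter out bad rows
with_fee = (a * 1.25 for a in valid)       # transformation: map (rate is illustrative)

total = reduce(lambda x, y: x + y, with_fee)  # action: triggers evaluation
print(total)  # → 625.0
```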

Frequently Asked Questions about Sai Kumar

What company does Sai Kumar work for?

Sai Kumar works for TD.

What is Sai Kumar's role at the current company?

Sai Kumar's current role is Senior Data Engineer.

