Sai Kumar

Data Engineer @ ATB Financial
Toronto, ON, CA
Sai Kumar's Location: Scarborough, Ontario, Canada
About Sai Kumar

I am an experienced IT professional with around six years of experience specializing in Big Data technologies and Hadoop frameworks. My expertise spans the Hadoop ecosystem, including HDFS, YARN, MapReduce, Hive, Impala, Pig, Sqoop, HBase, Spark, Spark SQL, Kafka, Spark Streaming, Flume, Oozie, Zookeeper, and Hue. I have extensive experience developing, testing, documenting, deploying, and integrating solutions using SQL and Big Data technologies, and a strong background in distributed systems, HDFS architecture, and the internal workings of the MapReduce and Spark processing frameworks.

I have deployed Big Data applications using Talend on cloud-based ETL platforms such as AWS and Microsoft Azure, and have experience orchestrating end-to-end data integration pipelines using Azure Data Factory. I am proficient in developing notebooks in Azure Databricks, working with Delta Lake, and managing credentials in Azure Key Vault. Additionally, I have hands-on experience setting up workflows with the Apache Airflow and Oozie workflow engines to manage and schedule Hadoop jobs.

I have solid knowledge of machine learning algorithms such as logistic regression, random forest, KNN, SVM, ensemble models, neural networks, regression techniques, and k-means clustering. I am skilled in handling real-time streaming data using Kafka, Flume, and Spark Streaming, as well as optimizing Hive tables with partitioning and bucketing for improved query performance. I have used Spark SQL to read data from Hive tables and perform data cleansing, validation, transformation, and aggregation per business requirements.

I am also an experienced Power BI and Tableau developer, skilled in building and publishing customized interactive reports and dashboards. I have developed data ingestion modules using AWS Step Functions, AWS Glue, and Python, and have deployed cloud-based services using various AWS tools across databases, migration, compute, IAM, storage, analytics, network and content delivery, Lambda, and application integration. With proficiency in programming languages such as C, SQL, and Python, I have managed the complete project lifecycle for client-server and web applications, driving data modeling and mining and recommending strategies to enhance data reliability, efficiency, and quality.
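As an illustration of the Hive optimization and Spark SQL work described above, here is a minimal sketch of reading a Hive table with Spark SQL, cleansing and aggregating it, and writing the result back partitioned and bucketed for faster queries. The database, table, and column names (raw_db.transactions, txn_date, region, amount) are hypothetical placeholders, not details from any actual project.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive support lets Spark SQL read and write Hive-managed tables.
spark = (
    SparkSession.builder
    .appName("hive-cleanse-aggregate")   # hypothetical job name
    .enableHiveSupport()
    .getOrCreate()
)

# Read a Hive table via Spark SQL (raw_db.transactions is a placeholder).
raw = spark.sql("SELECT * FROM raw_db.transactions")

# Basic cleansing and validation: drop rows missing keys, keep positive amounts.
clean = (
    raw.dropna(subset=["txn_id", "txn_date"])
       .filter(F.col("amount") > 0)
       .withColumn("txn_date", F.to_date("txn_date"))
)

# Aggregate per business requirement (daily totals per region, as an example).
daily = (
    clean.groupBy("txn_date", "region")
         .agg(F.sum("amount").alias("total_amount"),
              F.count(F.lit(1)).alias("txn_count"))
)

# Write back as a Hive table partitioned by date and bucketed by region,
# so typical date-range + region queries prune partitions and buckets.
(
    daily.write
         .mode("overwrite")
         .partitionBy("txn_date")
         .bucketBy(8, "region")
         .sortBy("region")
         .saveAsTable("curated_db.daily_transactions")
)
```

Bucketing is only supported when saving as a table, so the result is written with saveAsTable; partitioning on the common filter column and bucketing on the grouping key is the usual pattern, with the exact keys depending on the query workload.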

Sai Kumar's Current Company Details

ATB Financial
Data Engineer
Toronto, ON, CA
Website: atb.com/wealth
Employees: 412
Sai Kumar Work Experience Details
  • ATB Financial
    Data Engineer
    ATB Financial Dec 2023 - Present
    Toronto, Ontario, Canada
    • Developed Create, Read, Update, and Delete (CRUD) methods in Active Record. Conducted performance tuning and optimization of Kubernetes and Docker deployments to improve overall system performance.
    • Used Django Evolution and manual SQL modifications to modify Django models while retaining all data, with the site in production. Presented the project to faculty and industry experts, showcasing the pipeline's effectiveness in providing real-time insights for marketing and brand management.
    • Analyzed existing systems and proposed process and system improvements, adopting modern scheduling tools such as Airflow and migrating legacy systems into an enterprise data lake built on Azure Cloud.
    • Responsible for building and testing applications. Handled database issues and connections with SQL and NoSQL databases such as MongoDB by installing and configuring various Python packages (Teradata, MySQL, MySQL Connector, PyMongo, and SQLAlchemy).
    • Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and Azure Data Lake Analytics. Ingested data into one or more Azure services and processed it in Azure Databricks. Built Jenkins jobs for CI/CD infrastructure for GitHub repos.
    • Imported real-time weblogs using Kafka as a messaging system, ingested the data into Spark Streaming, performed data quality checks there, and flagged records as bad or passable (see the streaming sketch after this list).
    • Created Kubernetes replication controllers, clusters, and label services to deploy microservices in Docker.
    • Used Python to write data into JSON files for testing Django websites, and created scripts for data modelling and data import/export. Led requirement gathering, business analysis, and technical design for Hadoop and Big Data projects.
  • Intelcom | Dragonfly
    Data Engineer
    Intelcom | Dragonfly Sep 2022 - Dec 2023
    Montreal, Quebec, Canada
    • Trained and documented the initial deployment, and supported product stabilization and debugging at the deployment stage.
    • Developed a fully automated continuous integration system using Git, Jenkins, MySQL, and custom tools developed in Python and Bash. Used Python-based GUI components for front-end functionality such as selection criteria.
    • Developed data connectors to extract data from social media APIs and integrated them into the pipeline using Python and Apache Kafka Connect. Wrote queries in MySQL and native SQL.
    • Deployed models as a Python package, as an API for backend integration, and as services in a microservices architecture with a Kubernetes orchestration layer over Docker containers.
    • Worked on the Kafka subscriber side, processing messages and inserting them into the database (see the consumer sketch after this list), and used Apache Spark for real-time data processing. Created clusters to classify control and test groups.
    • Involved in all phases of the Software Development Lifecycle (SDLC), including requirements gathering, design, development, deployment, and analysis. Loaded data from the BDW Oracle database and Teradata into HDFS using Sqoop. Implemented AJAX, JSON, and JavaScript to create interactive web screens.
    • Consulted leadership and stakeholders to share design recommendations, identify product and technical requirements, resolve technical problems, and suggest Big Data-based analytical solutions.
    • Spearheaded HBase setup and utilized Spark and Spark SQL to develop faster data pipelines, resulting in a 60% reduction in processing time and improved data accuracy. Designed and developed a Java API (Commerce API) that provides connectivity to Cassandra through Java services.
    • Worked on CI/CD tools such as Jenkins and Docker in the DevOps team, setting up the application process end to end using deployment for lower environments and delivery for higher environments, with approvals in between.
  • Accenture
    Data Engineer
    Accenture May 2018 - Aug 2021
    India
    • Utilized Sqoop to ingest real-time data. Used the analytics libraries Scikit-learn, MLlib, and MLxtend, and extensively used Python data science packages such as Pandas, NumPy, Matplotlib, Seaborn, SciPy, Scikit-learn, and NLTK.
    • Performed exploratory data analysis to find trends and clusters. Built models using techniques such as regression, tree-based ensemble methods, time series forecasting, KNN, clustering, and Isolation Forest.
    • Worked on a combination of unstructured and structured data from multiple sources and automated the cleaning using Python scripts. Built scalable and deployable machine learning models.
    • Extensively performed large data reads and writes to and from CSV and Excel files using Pandas.
    • Communicated and coordinated with other departments to collect business requirements. Tackled a highly imbalanced fraud dataset using undersampling with ensemble methods, oversampling, and cost-sensitive algorithms (see the fraud-model sketch after this list).
    • Improved fraud prediction performance by using random forest and gradient boosting for feature selection with Python Scikit-learn. Maintained RDDs using Spark SQL. Implemented machine learning models (logistic regression, XGBoost) with Python Scikit-learn, optimized them with stochastic gradient descent, and fine-tuned parameters both manually and with automated tuning such as Bayesian optimization.
    • Developed a technical brief based on the business brief, containing the detailed steps, stages, and timelines for developing and delivering the project; after client sign-off on the technical brief, started developing the SAS code.
    • Wrote data validation SAS code using the UNIVARIATE and FREQ procedures.
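A minimal sketch of the Kafka-to-Spark-Streaming quality-check pattern referenced in the ATB Financial bullets above: read weblog events from Kafka, validate each record, and flag it as passable or bad. The broker address, topic, schema, and output paths are hypothetical placeholders, and Structured Streaming is used here in place of the older DStream API.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.appName("weblog-quality-checks").getOrCreate()

# Hypothetical weblog schema for the JSON payload carried in each Kafka message.
schema = StructType([
    StructField("ip", StringType()),
    StructField("url", StringType()),
    StructField("status", IntegerType()),
    StructField("ts", StringType()),
])

# Subscribe to a (placeholder) weblogs topic.
raw = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
         .option("subscribe", "weblogs")                     # placeholder topic
         .load()
)

# Parse the message value and apply simple data quality rules:
# a record is "passable" if required fields are present and status is a valid HTTP code.
events = raw.select(
    F.from_json(F.col("value").cast("string"), schema).alias("e")
).select("e.*")

flagged = events.withColumn(
    "quality_flag",
    F.when(
        F.col("ip").isNotNull()
        & F.col("url").isNotNull()
        & F.col("status").between(100, 599),
        F.lit("passable"),
    ).otherwise(F.lit("bad")),
)

# Write flagged records out; bad rows can be routed to a quarantine sink downstream.
query = (
    flagged.writeStream
           .format("parquet")
           .option("path", "/data/weblogs/flagged")          # placeholder path
           .option("checkpointLocation", "/chk/weblogs")     # placeholder checkpoint
           .outputMode("append")
           .start()
)
query.awaitTermination()
```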
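A minimal sketch of the subscriber-side pattern from the Intelcom | Dragonfly bullets: consume messages from a Kafka topic and insert them into a relational database. It assumes the kafka-python and SQLAlchemy packages; the topic name, table, columns, and connection string are hypothetical.

```python
import json

from kafka import KafkaConsumer          # kafka-python package
from sqlalchemy import create_engine, text

# Placeholder connection string and topic; real values would come from config.
engine = create_engine("mysql+pymysql://user:pass@db-host/orders")
consumer = KafkaConsumer(
    "order-events",                               # placeholder topic
    bootstrap_servers=["broker:9092"],            # placeholder broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
    enable_auto_commit=True,
    group_id="order-loader",
)

insert_stmt = text(
    "INSERT INTO order_events (order_id, status, event_ts) "
    "VALUES (:order_id, :status, :event_ts)"
)

# Process each message as it arrives and persist it, one transaction per message.
for message in consumer:
    event = message.value
    with engine.begin() as conn:
        conn.execute(insert_stmt, {
            "order_id": event.get("order_id"),
            "status": event.get("status"),
            "event_ts": event.get("event_ts"),
        })
```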
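A minimal sketch of the imbalanced fraud-detection approach described in the Accenture bullets: undersample the majority class, use random-forest importances for feature selection, and train and evaluate a classifier with scikit-learn. The input file, label column, and model choices are hypothetical placeholders.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Placeholder dataset: a binary 'fraud' label column plus numeric features.
df = pd.read_csv("transactions.csv")
X, y = df.drop(columns=["fraud"]), df["fraud"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Undersample the majority (non-fraud) class to match the fraud count.
train = pd.concat([X_train, y_train], axis=1)
fraud = train[train["fraud"] == 1]
legit = train[train["fraud"] == 0].sample(n=len(fraud), random_state=42)
balanced = pd.concat([fraud, legit])
Xb, yb = balanced.drop(columns=["fraud"]), balanced["fraud"]

# Random-forest feature importances drive feature selection.
selector = SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=42))
selector.fit(Xb, yb)

# Train a simple classifier on the selected features; evaluate on the untouched test set.
clf = LogisticRegression(max_iter=1000)
clf.fit(selector.transform(Xb), yb)
print(classification_report(y_test, clf.predict(selector.transform(X_test))))
```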

Sai Kumar Education Details
  • JNTUH College of Engineering Hyderabad
  • Sault College of Applied Arts and Technology

Frequently Asked Questions about Sai Kumar

What company does Sai Kumar work for?

Sai Kumar works for ATB Financial.

What is Sai Kumar's role at the current company?

Sai Kumar's current role is Data Engineer.

What schools did Sai Kumar attend?

Sai Kumar attended JNTUH College of Engineering Hyderabad and Sault College of Applied Arts and Technology.
