Guru Sai

Programmer Analyst @ Cognizant
Framingham, MA, US
Guru Sai's Location
Dallas, Texas, United States
About Guru Sai

Guru Sai is a Programmer Analyst at Cognizant.

Guru Sai's Current Company Details
Cognizant
Programmer Analyst
Framingham, MA, US
Guru Sai Work Experience Details
  • Cognizant
    Programmer Analyst
    Cognizant
    Framingham, MA, US
  • Citrix
    Senior Data Engineer
    Citrix Jan 2021 - Present
    Fort Lauderdale, FL, US
    • Developed NiFi workflows to pick up data from a REST API server, the data lake, and an SFTP server and publish it to a Kafka broker.
    • Experienced in handling administration activities using Cloudera Manager.
    • Created and maintained optimal data pipeline architecture in Microsoft Azure using Data Factory and Azure Databricks.
    • Implemented workflows using the Apache Oozie framework to automate tasks.
    • Migrated tables from RDBMSs into Hive tables using Sqoop and later generated visualizations using Tableau.
    • Built pipelines to move hashed and un-hashed data from XML files to Azure Data Lake.
    • Developed Spark scripts in Python on Azure HDInsight for data aggregation and validation, and verified their performance against MapReduce jobs.
    • Analyzed existing databases, tables, and other objects to prepare a migration to Azure Synapse.
    • Used Kafka features such as distribution, partitioning, and the replicated commit log for messaging systems by maintaining feeds, and created applications using Kafka.
    • Migrated an on-premises data warehouse to Azure Synapse using PolyBase and ADF.
    • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data.
    • Ingested data into HBase using the HBase shell as well as the HBase client API.
    • Created ETL mappings with Talend Integration Suite to pull data from sources, apply transformations, and load data into target databases.
    • Wrote Spark applications in Scala that interact with a PostgreSQL database through the Spark SQL context and access Hive tables through the Hive context.
    • Wrote PySpark and Spark SQL transformations in Azure Databricks to perform complex transformations for business rule implementation.
    • Built the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and 'big data' technologies such as Hadoop Hive and Azure Data Lake Storage.
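The Kafka work described above relies on key-based partition assignment: messages with the same key always land on the same partition, which is what preserves per-key ordering in the replicated commit log. A minimal sketch of that idea in plain Python (the function name is illustrative, and MD5 stands in for Kafka's actual murmur2 partitioner):

```python
import hashlib

def assign_partition(key: str, num_partitions: int) -> int:
    # Hash the message key and take it modulo the partition count,
    # so every message with the same key maps to the same partition.
    # (Kafka's default partitioner uses murmur2; MD5 here is just a
    # stand-in for any stable hash.)
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Same key -> same partition, regardless of when the message is sent.
p1 = assign_partition("order-17", num_partitions=6)
p2 = assign_partition("order-17", num_partitions=6)
```

Because the mapping is deterministic, a consumer reading one partition sees all events for a given key in order.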
  • Amway
    Data Engineer
    Amway May 2018 - Dec 2020
    Ada, Michigan, US
    • Installed, configured, and maintained Apache Hadoop clusters for application development based on requirements.
    • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java and Scala for data cleaning and preprocessing.
    • Developed real-time streaming applications using PySpark, Apache Flink, Kafka, and Hive on a distributed Hadoop cluster.
    • Designed and developed Oracle PL/SQL and shell scripts for data import/export, data conversions, and data cleansing.
    • Created on-demand tables on S3 files using Lambda functions and AWS Glue with Python and PySpark.
    • Generated scripts in AWS Glue to transfer data, and used AWS Glue to run ETL jobs and aggregations in PySpark code.
    • Created data pipelines for different events to load data from DynamoDB to an AWS S3 bucket and then into an HDFS location.
    • Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 and ORC/Parquet/text files into AWS Redshift.
    • Developed Spark programs in Scala and applied principles of functional programming to batch processing.
    • Designed and deployed multi-tier applications using AWS services (EC2, Route 53, S3, RDS, DynamoDB, SNS, SQS, IAM), focusing on high availability, fault tolerance, and auto-scaling with AWS CloudFormation.
    • Developed and deployed stacks using AWS CloudFormation templates (CFTs) and Terraform.
    • Developed stored procedures and views in Snowflake and used them in Talend for loading dimensions and facts.
    • Installed and configured Apache Airflow for the S3 bucket and Snowflake data warehouse and created DAGs to orchestrate jobs in Airflow.
    • Created development and test environments for different applications by provisioning Kubernetes clusters on AWS using Docker, Ansible, and Terraform.
    • Supported continuous storage in AWS using Elastic Block Store, S3, and Glacier; created volumes and configured snapshots for EC2 instances.
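The Glue jobs above follow the usual extract-cleanse-aggregate shape: read raw records, drop malformed rows, and aggregate before loading. A self-contained, pure-Python sketch of that shape (the sample data and function names are hypothetical; a real Glue job would operate on DynamicFrames or PySpark DataFrames):

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw extract, including one malformed record.
RAW = """region,amount
east,100
east,250
west,80
west,not_a_number
"""

def extract(text: str) -> list[dict]:
    # Parse the raw CSV into a list of row dictionaries.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows: list[dict]) -> dict[str, int]:
    # Cleansing step: drop rows whose amount is not numeric,
    # then aggregate totals per region.
    totals: dict[str, int] = defaultdict(int)
    for row in rows:
        try:
            totals[row["region"]] += int(row["amount"])
        except ValueError:
            continue  # skip malformed records
    return dict(totals)

result = transform(extract(RAW))  # {'east': 350, 'west': 80}
```

The same three stages map onto a Glue job's source, transform, and sink nodes; only the data structures change.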
  • Cummins Inc.
    Big Data Developer
    Cummins Inc. Feb 2017 - Apr 2018
    Columbus, Indiana, US
    • Gathered business requirements, developed a strategy for data cleansing and data migration, wrote functional and technical specifications, created source-to-target mappings, designed data profiling and data validation jobs in Informatica, and created ETL jobs in Informatica.
    • Worked on a Hadoop cluster that ranged from 4-8 nodes during the pre-production stage and was sometimes extended up to 24 nodes in production.
    • Built APIs that allow customer service representatives to access the data and answer queries.
    • Designed changes to transform current Hadoop jobs to HBase.
    • Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
    • Created an Azure SQL database and performed monitoring and restoring of it; performed migration of Microsoft SQL Server to Azure SQL Database.
    • Implemented bucketing and partitioning in Hive to assist users with data analysis.
    • Designed and implemented migration strategies for traditional systems on Azure (lift-and-shift/Azure Migrate).
    • Utilized the Waterfall methodology for team and project management.
    • Used Git for version control with Data Engineer and Data Scientist colleagues.
    • Created Hive tables on HDFS to store data processed by Apache Spark on the Cloudera Hadoop cluster in Parquet format.
    • Imported, cleaned, filtered, and analyzed data using tools such as SQL, Hive, and Pig.
    • Used Python and SAS to extract, transform, and load source data from transaction systems, and generated reports, insights, and key conclusions.
    • Designed and developed data mapping procedures for ETL data extraction, data analysis, and the loading process, integrating data using R programming.
    • Effectively communicated plans, project status, project risks, and project metrics to the project team; planned test strategies in accordance with project scope.
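Hive's bucketing and partitioning, mentioned above, come down to two small mechanics: partitions are key=value directories under the table root, and a bucket is chosen by hashing the bucketing column modulo the bucket count. A sketch of both (function names are illustrative, and CRC32 stands in for Hive's own hash function):

```python
import zlib

def partition_path(table_root: str, **partition_cols) -> str:
    # Hive lays partitions out as key=value directories under the
    # table root, e.g. warehouse/sales/dt=2018-01-01/region=us,
    # so a query that filters on dt and region only reads one directory.
    parts = [f"{k}={v}" for k, v in partition_cols.items()]
    return "/".join([table_root, *parts])

def bucket_id(key: str, num_buckets: int) -> int:
    # Bucketing assigns a row to one of a fixed number of files by
    # hashing the bucketing column modulo the declared bucket count.
    # (CRC32 here is a stand-in for Hive's hash; the real function differs.)
    return zlib.crc32(key.encode("utf-8")) % num_buckets

path = partition_path("warehouse/sales", dt="2018-01-01", region="us")
# path == "warehouse/sales/dt=2018-01-01/region=us"
```

Partitioning prunes directories at query time; bucketing additionally makes joins and sampling cheaper because matching keys land in matching bucket files.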
  • Dhruvsoft Services Private Limited
    Data Engineer
    Dhruvsoft Services Private Limited Jun 2015 - Nov 2016
    Hyderabad, Telangana, IN
    • Gathered business requirements for the definition and design of data sourcing; worked with the data warehouse architect on the development of logical data models.
    • Used a microservice architecture with Spring Boot based services interacting through a combination of REST and Apache Kafka message brokers.
    • Designed and developed Azure Data Factory (ADF) pipelines extensively for ingesting data from different source systems, both relational and non-relational, to meet business functional requirements.
    • Performed regression testing for golden test cases from the state (end-to-end test cases) and automated the process using Python scripts.
    • Developed Spark jobs using Scala for faster real-time analytics and used Spark SQL for querying.
    • Generated graphs and reports using the ggplot package in RStudio for analytical models; developed and implemented an R and Shiny application that showcases machine learning for business forecasting.
    • Researched reinforcement learning and control (TensorFlow, Torch) and machine learning models (scikit-learn).
    • Designed and developed a new solution to process NRT data using Azure Stream Analytics, Azure Event Hubs, and Service Bus queues.
    • Performed K-means clustering, regression, and decision trees in R; worked on data cleaning and reshaping, generating segmented subsets using NumPy and pandas in Python.
    • Performed feature engineering using the pandas and NumPy packages in Python and built models using deep learning frameworks.
    • Applied various machine learning algorithms and statistical modeling techniques such as decision trees, text analytics, sentiment analysis, Naive Bayes, logistic regression, and linear regression in Python to determine the accuracy rate of each model.
    • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
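The K-means clustering mentioned above was done in R, but the algorithm itself is small enough to sketch in a few lines of standard-library Python: assign each point to its nearest center, then move each center to its cluster mean, and repeat. This one-dimensional version with fixed starting centers is purely for illustration:

```python
def kmeans_1d(points, centers, iterations=10):
    # A minimal 1-D k-means: alternate between assigning points to
    # their nearest center and recomputing each center as the mean
    # of its assigned points.
    for _ in range(iterations):
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        # Empty clusters keep their old center.
        centers = [sum(v) / len(v) if v else c for c, v in clusters.items()]
    return sorted(centers)

# Two clearly separated groups converge to their group means.
centers = kmeans_1d([1, 2, 3, 10, 11, 12], centers=[0, 5])
# centers == [2.0, 11.0]
```

Production K-means (in R's `kmeans` or scikit-learn) adds multi-dimensional distances, smarter initialization, and convergence checks, but the assign/update loop is the same.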
  • Grapesoft Solutions
    Data Engineer
    Grapesoft Solutions Jul 2014 - May 2015
    • Responsible for ETL design (identifying the source systems, designing source-to-target relationships, data cleansing, data quality, creating source specifications, and ETL design documents).
    • Worked extensively with the Spark SQL context to create DataFrames and Datasets to pre-process model data.
    • Created AWS CodePipeline pipelines, which build, test, and deploy code every time there is a code change, based on the release process models.
    • Designed the row key in HBase to store text and JSON as key values, composing the row key so that rows can be retrieved and scanned in sorted order.
    • Developed DataStage jobs to cleanse, transform, and load data to the data warehouse, and sequencers to encapsulate the DataStage job flow.
    • Used data transformation tools such as DTS, SSIS, Informatica, and DataStage.
    • Responsible for data extraction and ingestion from different data sources into the Hadoop data lake by creating ETL pipelines using Pig and Hive.
    • Implemented AWS API Gateway protection strategies such as resource policies, IAM, Lambda, and Cognito authentication.
    • Wrote JUnit tests and integration test cases for those microservices.
    • Developed a NiFi workflow to pick up multiple files from an FTP location and move them to HDFS on a daily basis.
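The HBase row-key design mentioned above hinges on one fact: HBase stores rows in lexicographic key order, so the key itself encodes the scan order. A common pattern is a fixed-width, *reversed* timestamp, so the newest rows for an entity sort first. A sketch of that pattern (the separator, width, and bound below are illustrative choices, not an HBase API):

```python
MAX_TS = 10**13  # hypothetical upper bound for epoch-millis timestamps

def row_key(entity_id: str, ts_millis: int) -> str:
    # Zero-pad the reversed timestamp so string comparison matches
    # numeric comparison; newer events get smaller reversed values
    # and therefore sort first in a lexicographic scan.
    reversed_ts = MAX_TS - ts_millis
    return f"{entity_id}#{reversed_ts:013d}"

# A sorted scan over these keys returns the most recent event first.
keys = sorted(row_key("user42", ts) for ts in [1_000, 2_000, 3_000])
```

With keys shaped this way, "latest N events for user42" becomes a prefix scan on `user42#` that stops after N rows, instead of a full scan plus a sort.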

Frequently Asked Questions about Guru Sai

What company does Guru Sai work for?

Guru Sai works for Cognizant.

What is Guru Sai's role at the current company?

Guru Sai's current role is Programmer Analyst.
