Harsha .

Harsha . Email and Phone Number

Sr Azure Data Engineer @ CVS Health
United States
Harsha .'s Location
United States
About Harsha .

I have hands-on experience with cloud platforms, particularly Azure. With a deep understanding of Azure services such as Azure Virtual Machines, Azure App Services, Azure Storage, and Azure SQL Database, I'm adept at crafting scalable and resilient solutions tailored to diverse business needs. My familiarity with AWS further broadens my expertise in cloud computing. I have 10+ years of professional experience in Information Technology overall, including 8+ years of expertise in Big Data technologies.

  • Expertise in major Hadoop ecosystem components such as HDFS, YARN, MapReduce, Hive, Impala, Pig, Sqoop, HBase, Spark, Spark SQL, Kafka, Spark Streaming, Flume, Oozie, Zookeeper, and Hue.

My Azure journey has been characterized by a relentless pursuit of excellence, a commitment to continuous learning and innovation, and a steadfast dedication to delivering value to every project I undertake. I'm excited about the future of cloud computing and eager to continue pushing the boundaries of what's possible with Azure.

Harsha .'s Current Company Details
CVS Health

Sr Azure Data Engineer
United States
Harsha . Work Experience Details
  • CVS Health
    Sr Azure Data Engineer
    CVS Health Mar 2022 - Present
    United States
    • Collaborated with cross-functional teams to design and implement scalable data architectures.
    • Evaluated client needs and translated them into business requirements, onboarding clients onto the Hadoop ecosystem.
    • Extracted, transformed, and loaded data from source systems to Azure storage services using a combination of Azure Data Factory, T-SQL, and Spark SQL; ingested data into Azure services such as Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Azure Databricks, and Azure SQL Data Warehouse.
    • Utilized Azure Databricks to configure data pipelines for data validation and profiling, reducing data anomalies (see the sketch after this entry).
    • Controlled and granted database access, and migrated on-premises databases to Azure Data Lake Store using Azure Data Factory.
    • Designed and implemented ETL processes using Azure Data Factory, resulting in a 50% reduction in processing time.
    • Executed the migration of on-premises systems and applications to Azure, reducing operational costs, and improved query execution speed by eliminating redundant data.
    • Implemented data governance policies, reducing data quality issues.
    • Built pipelines in Azure Data Factory to copy data from source to destination, and created dependencies between Data Factory activities.
    • Managed Azure Data Lake Storage (ADLS) and Data Lake Analytics, reducing data retrieval time by 25%.
    • Worked on SQL Server migration to Azure cloud databases; monitored, produced, and consumed ADF datasets.
    • Wrote complex SnowSQL scripts in the Snowflake cloud data warehouse for business analysis and reporting, and developed Snowflake procedures with branching and looping.
    • Performed data quality analysis using SnowSQL by building analytical warehouses on Snowflake.
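As an illustration of the Databricks validation and profiling work listed above, here is a minimal PySpark sketch of the general pattern. The table and column names (raw_claims, member_id, claim_date, validated_claims) are hypothetical placeholders, not details from the profile.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("claims-validation").getOrCreate()

# Hypothetical source table; in Databricks this would typically be a
# Delta table registered in the metastore.
df = spark.table("raw_claims")

# Profile: row count, null counts on the key column, duplicate keys.
total = df.count()
null_ids = df.filter(F.col("member_id").isNull()).count()
dupes = total - df.dropDuplicates(["member_id", "claim_date"]).count()
print(f"rows={total}, null member_id={null_ids}, duplicate keys={dupes}")

# Validate: keep only rows with a member id and a plausible date,
# then write the cleaned set onward for downstream ETL.
clean = (
    df.filter(F.col("member_id").isNotNull())
      .filter(F.col("claim_date") >= F.lit("2000-01-01"))
      .dropDuplicates(["member_id", "claim_date"])
)
clean.write.mode("overwrite").saveAsTable("validated_claims")
```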
  • Fannie Mae
    Sr Data Engineer
    Fannie Mae Jul 2020 - Jan 2022
    United States
    • Designed end-to-end scalable architectures to solve business problems using Azure components such as HDInsight, Data Factory, Data Lake, Storage, and Machine Learning Studio.
    • Developed and maintained data pipelines using Azure services, resulting in a 40% increase in data processing speed.
    • Designed and implemented a data lake architecture, resulting in a 25% reduction in storage costs.
    • Developed JSON scripts for deploying Data Factory (ADF) pipelines that process data using the SQL activity.
    • Wrote multiple Hive UDFs using Core Java and OOP concepts, and Spark functions within Python programs.
    • Used Azure Event Grid as a managed event service to route events across many different Azure services and applications.
    • Utilized Delta Lake for scalable metadata handling and for unifying streaming and batch workloads.
    • Used Delta Lake time travel: data versioning enables rollbacks, full historical audit trails, and reproducible machine learning experiments (see the sketch below).
    • Used Azure Databricks as a fast, easy, and collaborative Spark-based platform on Azure that integrates with the broader Microsoft stack.
    • Developed and deployed big data solutions using Hadoop and Azure HDInsight.
    • Spun up HDInsight clusters and used Hadoop ecosystem tools such as Kafka, Spark, and Databricks for real-time streaming analytics, and Sqoop, Pig, Hive, and Cosmos DB for batch jobs.
    • Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
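A minimal sketch of the Delta Lake time-travel feature referenced above, assuming a Spark session with the Delta Lake extensions available (as in Databricks). The path /mnt/datalake/loans and the version number and timestamp are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-time-travel").getOrCreate()

path = "/mnt/datalake/loans"  # hypothetical Delta table location

# Current state of the table.
current = spark.read.format("delta").load(path)

# Time travel: read the table as of an earlier version number...
v3 = spark.read.format("delta").option("versionAsOf", 3).load(path)

# ...or as of a timestamp, e.g. for an audit or to reproduce an ML
# training set exactly as it existed on that date.
snapshot = (
    spark.read.format("delta")
         .option("timestampAsOf", "2021-06-01")
         .load(path)
)

# Rows added or changed since version 3 can be inspected by diffing.
changed = current.subtract(v3)
print(changed.count(), "rows differ from version 3")
```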
  • State Of Minnesota
    Azure Engineer/ Developer
    State Of Minnesota Jan 2019 - Jun 2020
    United States
    • Analyzed, developed, and built modern data solutions with Azure PaaS services to enable data visualization; assessed the application's current production state and the impact of new installations on existing business processes.
    • Worked on migration of data from on-premises SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB).
    • Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
    • Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
    • Created pipelines in Azure Data Factory utilizing Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back.
    • Used Azure ML to build, test, and deploy predictive analytics solutions based on data.
    • Used the Snowflake cloud data warehouse to integrate data from multiple source systems, including nested JSON-formatted data, into Snowflake tables (see the sketch below).
    • Created builds and releases for multiple projects (modules) in the production environment using Visual Studio Team Services (VSTS).
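A minimal sketch of loading and flattening nested JSON in Snowflake via the snowflake-connector-python package, the kind of integration described above. The connection parameters, stage, table, and field names are all hypothetical placeholders.

```python
import snowflake.connector

# Hypothetical credentials; in practice these would come from a vault
# or environment configuration, never hard-coded.
conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password="***",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)
cur = conn.cursor()

# Land raw nested JSON into a single VARIANT column.
cur.execute("CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT)")
cur.execute(
    "COPY INTO raw_events FROM @events_stage FILE_FORMAT = (TYPE = 'JSON')"
)

# Flatten a nested array into relational rows for downstream queries.
cur.execute("""
    CREATE OR REPLACE TABLE events AS
    SELECT
        payload:user.id::STRING   AS user_id,
        payload:ts::TIMESTAMP_NTZ AS event_ts,
        f.value:name::STRING      AS item_name
    FROM raw_events,
         LATERAL FLATTEN(input => payload:items) f
""")
cur.close()
conn.close()
```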
  • Verizon
    AWS Data Engineer
    Verizon Jul 2016 - Dec 2018
    United States
    • Performed end-to-end architecture and implementation assessments of AWS services such as Amazon EMR, Redshift, S3, Athena, Glue, and Kinesis.
    • Created an AWS RDS instance to serve as the Hive metastore, consolidating the metadata of multiple EMR clusters into a single RDS so that metadata survives even when an EMR cluster is terminated.
    • Used AWS Athena extensively to ingest structured data from S3 into multiple systems, including Redshift, and to generate reports.
    • Created on-demand tables on S3 files using Lambda functions and AWS Glue with Python and PySpark (see the sketch below).
    • Migrated a quality-monitoring tool from AWS EC2 to AWS Lambda and built logical datasets to administer quality monitoring on Snowflake warehouses.
    • Designed and implemented ETL pipelines over S3 Parquet files in the data lake using AWS Glue.
    • Built a data pipeline consisting of Spark, Hive, Sqoop, and custom-built input adapters to ingest, transform, and analyze user behavior (clickstream) data.
    • Converted Hive/SQL queries into Spark transformations using Spark DataFrames and Scala.
    • Created monitors, alarms, notifications, and logs for Lambda functions, Glue jobs, and EC2 hosts using CloudWatch, and used AWS Glue for data transformation, validation, and cleansing.
    • Used Spark Streaming APIs to perform transformations and actions on the fly, building a common learner data model fed from Kinesis in near real time.
    • Implemented a full CI/CD pipeline by integrating SCM (Git) with the build tool Gradle, deploying via Jenkins (declarative pipeline) and Dockerized containers in production, and worked with DevOps tools such as Ansible, Chef, AWS CloudFormation, AWS CodePipeline, Terraform, and Kubernetes.
    • Used the AWS Glue Data Catalog with crawlers to expose S3 data to SQL query operations, and defined JSON schemas for table and column mapping from S3 data to Redshift.
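A minimal PySpark skeleton of an AWS Glue job along the lines described above: read a crawled table from the Glue Data Catalog, transform it, and write Parquet back to S3. The catalog database, table, filter, and bucket names are hypothetical.

```python
import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve arguments and set up contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table that a Glue crawler has registered in the Data Catalog.
events = glue_context.create_dynamic_frame.from_catalog(
    database="clickstream_db",   # hypothetical catalog database
    table_name="raw_events",     # hypothetical crawled table
)

# Transform with plain Spark via a DataFrame, then convert back.
df = events.toDF().filter("event_type = 'page_view'")
out = DynamicFrame.fromDF(df, glue_context, "page_views")

# Write Parquet back to the data lake on S3.
glue_context.write_dynamic_frame.from_options(
    frame=out,
    connection_type="s3",
    connection_options={"path": "s3://my-datalake/page_views/"},
    format="parquet",
)
job.commit()
```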
  • Target
    ETL Developer / Data Engineer
    Target Jan 2015 - Jun 2016
    United States
    • Partnered with ETL developers to ensure that data was well cleaned and the data warehouse stayed up to date for reporting purposes, using Pig.
    • Selected and generated data into CSV files, stored them in AWS S3 via AWS EC2, then structured and stored the data in AWS Redshift; deployed services on AWS and used Step Functions to trigger the data pipelines.
    • Collected and aggregated large amounts of log data using Apache Flume, staging it in HDFS for further analysis.
    • Created plugins to extract data from multiple sources such as Apache Kafka, databases, and messaging queues.
    • Ran log aggregation, website activity tracking, and the commit log for a distributed system using Apache Kafka.
    • Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
    • Imported several transactional logs from web servers with Flume to ingest the data into HDFS.
    • Implemented custom serializers and interceptors in Flume to mask confidential data and filter unwanted records from the event payload.
    • Configured, designed, implemented, and monitored the Kafka cluster and connectors; responsible for ingesting large volumes of IoT data into Kafka.
    • Wrote Kafka producers to stream data from external REST APIs into Kafka topics (see the sketch below).
    • Worked with multiplexing, replicating, and consolidation in Flume.
    • Used Oozie operational services for batch processing and scheduling workflows dynamically.
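A minimal sketch of a Kafka producer that polls an external REST API and streams records into a topic, using the kafka-python and requests packages. The broker address, topic name, and API URL are hypothetical placeholders.

```python
import json
import time

import requests
from kafka import KafkaProducer

# Hypothetical broker and topic; records are serialized as JSON.
producer = KafkaProducer(
    bootstrap_servers="broker1:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

API_URL = "https://api.example.com/v1/orders"  # hypothetical endpoint

while True:
    # Pull the latest records from the REST API...
    resp = requests.get(API_URL, timeout=10)
    resp.raise_for_status()

    # ...and publish each record to the Kafka topic.
    for record in resp.json():
        producer.send("orders", value=record)

    producer.flush()  # ensure the batch is on the wire before sleeping
    time.sleep(30)    # simple poll interval; real code would checkpoint offsets
```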
  • HSBC
    Big Data Analyst
    HSBC Aug 2013 - Nov 2014
    United States
    • Involved in the requirement analysis, design, coding, and implementation phases of the project.
    • Loaded data from Teradata to HDFS using Teradata Hadoop connectors, and developed Sqoop import scripts for importing reference data from Teradata.
    • Converted existing MapReduce jobs into Spark transformations and actions using Spark RDDs, DataFrames, and Spark SQL APIs, and wrote new Spark jobs in Scala to analyze customer data.
    • Used Kafka to bring data from many streaming sources into HDFS.
    • Collected and aggregated large amounts of log data using Apache Flume, staging it in HDFS for further analysis.
    • Worked with Hive partitioning and bucketing, and performed different types of joins on Hive tables.
    • Created Hive external tables to perform ETL on data generated on a daily basis (see the sketch below).
    • Wrote HBase bulk-load jobs to load processed data into HBase tables by converting it to HFiles.
    • Performed validation on ingested data to filter and cleanse it in Hive.
    • Created Sqoop jobs to handle incremental loads from RDBMS into HDFS and applied Spark transformations.
    • Developed Oozie workflows to automate and productionize the data pipelines.
    • The application was deployed on WebSphere Application Server; Apache Ant was used for the build process, and JUnit was used to implement test cases for beans.
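A minimal PySpark sketch of the Hive external-table pattern described above: define a partitioned external table over daily files already landed in HDFS, register a new partition, and run a daily ETL query. The HDFS path, table, and column names are hypothetical.

```python
from pyspark.sql import SparkSession

# Hive support lets Spark create and query tables in the Hive metastore.
spark = (
    SparkSession.builder
    .appName("daily-etl")
    .enableHiveSupport()
    .getOrCreate()
)

# External table over daily-partitioned files; dropping the table
# leaves the underlying HDFS data intact.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS txn_logs (
        account_id STRING,
        amount     DOUBLE,
        ts         TIMESTAMP
    )
    PARTITIONED BY (dt STRING)
    STORED AS PARQUET
    LOCATION 'hdfs:///data/txn_logs'
""")

# Register a newly landed day so Hive can see it, then run the ETL.
spark.sql("ALTER TABLE txn_logs ADD IF NOT EXISTS PARTITION (dt='2014-06-01')")

daily = spark.sql("""
    SELECT account_id, SUM(amount) AS total
    FROM txn_logs
    WHERE dt = '2014-06-01'
    GROUP BY account_id
""")
daily.write.mode("overwrite").saveAsTable("daily_account_totals")
```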

Harsha . Education Details

  • Jawaharlal Nehru Technological University

Frequently Asked Questions about Harsha .

What company does Harsha . work for?

Harsha . works for CVS Health.

What is Harsha .'s role at the current company?

Harsha .'s current role is Sr Azure Data Engineer.

What schools did Harsha . attend?

Harsha . attended Jawaharlal Nehru Technological University.

Not the Harsha . you were looking for?

  • Harsha -

    DevOps Engineer | Cloud Engineer | SRE | Cloud Infrastructure | Containers-Kubernetes | IaC-Terraform | Ansible | CI/CD
    Schaumburg, IL
  • Harsha _

    SDET at Apple
    Austin, Texas Metropolitan Area
  • Harsha .

    Sr. Java Full-Stack Backend Developer with over 10 years of expertise in Telecom/Banking/Health sector applications | Java/J2EE Expert | AWS | Enhancing Enterprise Edition services with cutting-edge development practices
    United States
  • Harsha .

    Full Stack .NET Developer at U.S. Bank
    Frisco, TX
