Ashish C Email and Phone Number
Ashish C is a Senior GCP Data Engineer at Cummins Inc. Skills: Big Data | Python | Azure | PySpark | Spark SQL | Azure Databricks | Hadoop | Snowflake | ETL | SQL | Airflow | Agile. Actively looking for new opportunities on C2C/C2H.
Cummins Inc.
- Website: cummins.com
- Employees: 33,030
Sr GCP Data Engineer, Cummins Inc. | Jun 2022 - Present | Columbus, Indiana, United States
• Created data pipelines in Google Cloud Platform (GCP) using Apache Airflow to manage ETL jobs, employing various Airflow operators.
• Enhanced existing Hadoop algorithms with Spark, improving performance and optimization through Spark Context, Spark SQL, DataFrames, and Spark on YARN.
• Leveraged Spark Streaming to ingest data into a built-in ingestion platform.
• Developed RESTful APIs in Python with the Flask and Django frameworks, integrating data sources such as Java/JDBC connections, RDBMS, shell scripts, spreadsheets, and text files.
• Designed a performance tracking sheet for ETL runs at different project phases and shared it with the production team.
• Contributed to the identification and design of cost-effective solutions through research and evaluation of alternatives.
• Designed PySpark scripts for processing and transferring files to third-party vendors on an automated schedule.
• Demonstrated expertise in GCP services such as Dataproc, Google Cloud Storage (GCS), Cloud Functions, and BigQuery.
• Managed data transfer between GCP and Azure using Azure Data Factory.
• Developed Power BI reports on Azure Analysis Services to improve performance.
• Used the GCP Cloud Shell SDK to configure services such as Dataproc, Cloud Storage, and BigQuery.
• Managed continuous data loads through Snowpipe and authored SnowSQL queries for data analysis.
• Developed Spark programs in Scala to perform data transformations, create Datasets and DataFrames, and write Spark SQL queries; also worked on Spark Streaming and windowed streaming applications.
• Loaded, transformed, and analyzed structured, semi-structured, and unstructured data through Hive queries.
• Employed Avro and Parquet file formats for data serialization.
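The Airflow pipelines described above chain ETL tasks through operator dependencies. As a dependency-free sketch (plain Python standing in for Airflow operators such as PythonOperator; the task names, data, and functions are all hypothetical), the extract, transform, and load steps can be modeled as a small DAG resolved in topological order:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical stand-ins for Airflow tasks; a real DAG would use
# operators like PythonOperator or GCSToBigQueryOperator.
def extract():
    # pretend this pulls raw rows from GCS
    return [{"id": 1, "amount": "42.5"}, {"id": 2, "amount": "7.0"}]

def transform(rows):
    # cast string amounts to floats, as a trivial cleansing step
    return [{**r, "amount": float(r["amount"])} for r in rows]

def load(rows):
    # pretend this writes to BigQuery; return the row count
    return len(rows)

# Edges mirror Airflow's `extract >> transform >> load` syntax.
deps = {"transform": {"extract"}, "load": {"transform"}}
order = list(TopologicalSorter(deps).static_order())

data = None
for task in order:
    if task == "extract":
        data = extract()
    elif task == "transform":
        data = transform(data)
    elif task == "load":
        data = load(data)

print(order)  # ['extract', 'transform', 'load']
print(data)   # 2 rows "loaded"
```

In real Airflow the scheduler performs this ordering; the sketch only shows why declaring dependencies is enough to sequence the ETL steps.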
Sr AWS Data Engineer, Merck | Nov 2019 - May 2022 | Branchburg, NJ
• Crafted and deployed AWS solutions involving EC2, S3, EBS, Elastic Load Balancer (ELB), and Auto Scaling groups.
• Established AWS infrastructure for resources including VPC, EC2, S3, IAM, EBS, Security Groups, Auto Scaling, and RDS using CloudFormation JSON templates.
• Created AWS CloudFormation templates tailored to generate VPCs, subnets, and NAT gateways of custom sizes for deploying web application and database stacks.
• Developed stored procedures in MS SQL to retrieve data from different servers over FTP and processed these files to update tables.
• Conducted data analysis and profiling of source data to better understand the data sources.
• Downloaded BigQuery data into pandas or Spark DataFrames for advanced ETL; performed data transformation and cleansing using SQL queries, Python, and PySpark.
• Crafted Hive SQL scripts to generate complex tables with performance features such as partitioning, clustering, and skew handling.
• Constructed ETL pipelines using Spark and Hive to ingest data from multiple sources; owned ETL processes and data validation using SQL Server Integration Services.
• Wrote Python scripts to automate the identification of trends, outliers, and data anomalies, and loaded data from web APIs into staging databases.
• Reverse-engineered existing data models to accommodate new changes using Erwin; generated artifacts for the data engineering team, including source-to-target mappings, data quality rules, data transformation rules, and joins.
• Performed data visualization for different modules using Tableau and the One-Click method.
• Developed, deployed, and monitored event-driven and scheduled AWS Lambda functions triggered by events on various AWS sources, including logging.
• Created dashboards in Tableau with ODBC connections to sources such as BigQuery and the Presto SQL engine.
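One bullet above mentions Python scripts that flag outliers and anomalies. A minimal stdlib sketch of one common approach, Tukey's IQR rule (the data and threshold are illustrative, not from the original pipelines):

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Hypothetical daily row counts; 250 is an obvious spike.
daily_loads = [102, 98, 101, 99, 100, 97, 250]
print(iqr_outliers(daily_loads))  # [250]
```

In a real anomaly sweep, the same check would run per metric column after the web-API data lands in staging.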
AWS Data Engineer, Chevron | May 2017 - Oct 2019 | Santa Rosa, NM
• Utilized Apache Spark to convert unstructured data into structured data.
• Implemented advanced techniques such as text analytics and processing using Apache Spark's in-memory computing, coded in Scala.
• Created and managed S3 buckets, defined IAM role-based policies, and customized JSON templates.
• Implemented and maintained monitoring and alerting for production and corporate servers/storage using AWS CloudWatch.
• Launched EC2 instances (primarily Linux/Ubuntu) through AWS and configured them for specific applications.
• Leveraged Amazon Elastic Compute Cloud (EC2) for computational tasks and Simple Storage Service (S3) for storage.
• Employed AWS services including EMR, Lambda, Amazon Redshift, Glue, CloudFormation (CFT), IAM, KMS, and API Gateway.
• Utilized Spark and Spark SQL in Scala for faster data testing and processing; transformed Hive/SQL queries into Spark transformations using RDDs and Scala; documented requirements for code implementation using Spark, Hive, HDFS, and Elasticsearch.
• Managed the ELK stack (Elasticsearch and Kibana) and wrote Spark scripts using the Scala shell, taking advantage of DataFrames and the Spark SQL API for fast data processing.
• Built and managed Hadoop EMR clusters on AWS.
• Utilized AWS services such as VPC, EC2, S3, RDS, Redshift, Data Pipeline, EMR, DynamoDB, Lambda, SNS, and SQS.
• Created a Kafka producer to connect to external sources and transport data to a Kafka broker; managed schema changes in data streams using Kafka.
• Built pipelines, data flows, and complex data transformations and manipulations using Azure Data Factory (ADF) and PySpark with Databricks.
• Developed new Flume agents to extract data from Kafka.
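The Hive-to-Spark rewrites mentioned above come down to expressing SQL clauses as map/filter/aggregate steps. A stdlib Python sketch of that translation, with itertools standing in for Spark RDD operations (the sales data and column names are invented):

```python
from itertools import groupby
from operator import itemgetter

# SELECT region, SUM(amount) FROM sales WHERE amount > 0 GROUP BY region
sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
    {"region": "west", "amount": -2},  # dropped by the WHERE clause
]

filtered = [r for r in sales if r["amount"] > 0]    # WHERE   ~ rdd.filter
keyed = sorted(filtered, key=itemgetter("region"))  # shuffle ~ rdd.groupBy
totals = {k: sum(r["amount"] for r in g)            # SUM     ~ rdd.reduceByKey
          for k, g in groupby(keyed, key=itemgetter("region"))}
print(totals)  # {'east': 17, 'west': 5}
```

Spark distributes each of these steps across executors; the logical mapping from SQL clause to transformation is the same.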
Azure Data Engineer, Brio Technologies | Oct 2015 - Feb 2017 | Hyderabad, Telangana, India
• Analyzed, designed, and built modern data solutions using Azure Platform-as-a-Service (PaaS) offerings to facilitate data visualization.
• Used Azure Data Factory (ADF) to create pipelines with linked services and datasets for extracting, transforming, and loading data from sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and write-back tools.
• Implemented and managed ETL solutions while automating operational processes.
• Employed correlated and non-correlated subqueries to address complex business queries involving multiple tables across different databases.
• Developed data pipelines encompassing Spark, Hive, and custom-built input adapters for ingesting, transforming, and analyzing operational data.
• Worked with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (SQL DW).
• Designed and constructed schema data models and performed data cleaning and preparation for XML files.
• Developed SQL scripts for automation purposes.
• Engineered highly optimized Spark applications for data cleansing, validation, transformation, and summarization.
• Provided technical guidance for projects, ensuring completion within set timelines.
• Built complex distributed systems capable of handling substantial volumes of data, including metric collection, data pipeline creation, and analytics.
• Evaluated the current production state of applications and assessed the impact of new implementations on existing business processes.
• Collaborated with business users to gather requirements, design visualizations, and provide training on self-service BI tools.
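The correlated-subquery bullet above can be illustrated with stdlib sqlite3 (the schema and data are invented): for each order, the inner query re-runs against the outer row's customer to compare the order amount with that customer's average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('acme', 100), ('acme', 300), ('globex', 50), ('globex', 70);
""")

# Correlated subquery: the inner SELECT references o.customer from
# the outer scope, so it is evaluated once per outer row.
rows = conn.execute("""
    SELECT o.customer, o.amount
    FROM orders o
    WHERE o.amount > (SELECT AVG(i.amount)
                      FROM orders i
                      WHERE i.customer = o.customer)
    ORDER BY o.customer
""").fetchall()
print(rows)  # [('acme', 300.0), ('globex', 70.0)]
```

A non-correlated subquery, by contrast, runs once (e.g. `WHERE amount > (SELECT AVG(amount) FROM orders)`), which would keep only acme's 300 here.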
Data Engineer, Cybage Software | May 2013 - Sep 2015 | Hyderabad, Telangana, India
• Assessed Snowflake design considerations and constructed logical and physical data models to accommodate application changes.
• Employed dimensional and relational data modeling, incorporating Star and Snowflake schemas for OLTP and OLAP systems.
• Devised and executed incremental data extraction jobs from DB2, loading data into Hive tables and enabling interactive report generation with Tableau through HiveServer2.
• Performed data modeling and integration, connecting datasets with other dimension tables for Tableau reporting.
• Engaged in single-customer-view and Master Data Management (MDM) tasks, including testing MDM features and managing data in formats such as Parquet and JSON.
• Developed Spark applications using Spark SQL for data extraction, transformation, and aggregation from various file formats.
• Created Spark workflows in Scala to pull data from AWS S3 and Snowflake, applying transformations through AWS Glue scripts.
• Implemented data ingestion from on-premises applications to AWS, leveraging services such as Amazon Kinesis for real-time data processing and maintaining ETL/ELT jobs using Matillion.
• Used AWS EMR for efficient data transformation and migration to and from AWS data stores and databases, including Amazon S3 and Amazon DynamoDB.
• Managed streaming data warehousing on AWS S3 and Snowflake, integrating Apache Kafka via Spark.
• Implemented continuous integration and deployment (CI/CD) pipelines in Jenkins to automate Hadoop job deployment and managed Hadoop clusters with Cloudera.
• Created data partitions for large datasets in AWS S3 and executed DDL/DML/DQL/DCL operations on partitioned data.
• Scheduled jobs using Airflow scripts in Python, building Directed Acyclic Graphs (DAGs) with distinct tasks and incorporating Lambda functions.
• Established an AWS Lambda deployment function configured to respond to events from an S3 bucket.
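An S3-triggered Lambda like the one in the last bullet receives an event payload listing the affected objects. A minimal handler sketch in pure Python (no boto3; the event follows the shape of the documented S3 notification format, and the bucket/key values are made up; real code would process each object, e.g. via boto3):

```python
def handler(event, context=None):
    """Extract (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        objects.append((s3["bucket"]["name"], s3["object"]["key"]))
        # real code would fetch and process each object here
    return {"processed": objects}

# A trimmed-down S3 PUT event, shaped like the real notification payload.
fake_event = {"Records": [
    {"s3": {"bucket": {"name": "etl-landing"},
            "object": {"key": "incoming/data.json"}}},
]}
print(handler(fake_event))  # {'processed': [('etl-landing', 'incoming/data.json')]}
```

Keeping the parsing separate from the processing makes the handler easy to unit-test with fabricated events, exactly as done above.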
Ashish C Education Details
- Bachelor's Degree
Frequently Asked Questions about Ashish C
What company does Ashish C work for?
Ashish C works for Cummins Inc.
What is Ashish C's role at the current company?
Ashish C's current role is Senior GCP Data Engineer at Cummins Inc.
What schools did Ashish C attend?
Ashish C attended Jntuh College Of Engineering Hyderabad.
Who are Ashish C's colleagues?
Ashish C's colleagues are Francisco Martinez Camacho, Rick Cassidy, Fatimah Ajasa, Felix Azael, Andriy Abbott, Gary Johnson, John Scott.