Ashish J

Senior Data Engineer @ BCBS
Eagan, Minnesota, United States
About Ashish J

• 10+ years of professional experience in the IT industry, including design, development, and analysis of big data solutions in Spark, Hadoop, Pig, and HDFS environments, with experience in Python.
• Highly experienced in importing and exporting data between HDFS and relational systems such as MySQL and Teradata using Sqoop.
• Hands-on experience with Azure cloud services (VNet, VM, Data Lake Gen2, Synapse, Snowflake, Data Pipeline, HDInsight, Cosmos DB, Data Factory, Databricks).
• Knowledge of the big data database HBase and the NoSQL databases MongoDB and Cassandra.
• Hands-on scripting skills in Python, Linux, and UNIX shell. Thorough understanding of big data components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and the MapReduce programming paradigm.
• Comfortable working under different methodologies, including Agile, Waterfall, and Scrum.
• Expertise with the tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper, as well as the Hadoop architecture and its components.
• Experience with AWS services such as S3, Athena, Redshift Spectrum, Redshift, EMR, Glue, Data Pipeline, Step Functions, CloudWatch, SNS, and CloudFormation.
• Experienced with cloud Hadoop deployments: Hadoop on Azure, AWS EMR, Cloudera Manager, and direct Hadoop on EC2 (non-EMR).
• Experience with Agile methodologies; used Jira extensively for sprints and issue tracking.
• Expertise in writing Apache Spark Streaming applications on a big data distribution in an active cluster environment.
• Developed Spark RDD and Spark DataFrame APIs for distributed data processing (see the sketch below).
• Good understanding of and exposure to Python programming.
• Extensive knowledge of Amazon Web Services (AWS) EC2, S3, Elastic MapReduce (EMR), Snowflake, Redshift, and Identity and Access Management (IAM).
• Hands-on experience with AWS cloud services (VPC, EC2, S3, RDS, Redshift, Data Pipeline, EMR, DynamoDB, WorkSpaces, SNS, SQS).
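As a small illustration of the Spark DataFrame work mentioned above, here is a minimal PySpark sketch; the HDFS paths and column names are hypothetical, not drawn from the profile.

    # Read raw CSV from HDFS, drop malformed rows, derive a partition
    # key, and write partitioned Parquet back to HDFS.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("events-etl").getOrCreate()

    raw = spark.read.option("header", "true").csv("hdfs:///data/raw/events/")

    cleaned = (raw
               .filter(F.col("event_type").isNotNull())
               .withColumn("event_date", F.to_date("event_ts")))

    cleaned.write.mode("overwrite").partitionBy("event_date").parquet(
        "hdfs:///data/curated/events/")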

Ashish J's Current Company Details
BCBS
Senior Data Engineer
Eagan, Minnesota, United States
Employees: 3234
Ashish J Work Experience Details
  • BCBS
    Sr Data Engineer
    Sep 2022 - Present
    Responsibilities:
    • Developed ETL jobs on the Databricks Lakehouse (Delta Lake) to ingest and load membership files and claims for various payers and providers into MDM tables, improving system performance by 25% (see the sketch below).
    • Led data curation efforts aligned with the star-schema data model, optimizing data structures to meet business needs and improve accessibility for data science and analytics teams.
    • Enabled governance, observability, and data lineage using Databricks Unity Catalog, and implemented monitoring and alerting to notify users when pipelines break.
    • Implemented data quality checks using dbt to uphold consistency and integrity, including missing-value checks, row and column counts, and aggregate-value validation.
    • Built and maintained ETL processes using Azure Data Factory and Python to extract, transform, and load data from various sources into Snowflake.
    • Ensured code quality and documentation excellence to improve readability and speed up debugging, contributing to streamlined operations and better team collaboration.
    • Processed Avro-serialized streaming data with ksqlDB, enabling seamless handling of member enrollment, policy card dispatch, and payments; integrated with multiple subscribers, including Salesforce CCI and the sales and marketing teams, ensuring timely access to critical information.
    • Developed data pipelines in Azure Data Factory (Linked Services, Datasets, and Pipelines) to extract, transform, and load data from diverse sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.
    Environment: Databricks, Azure VM, Kafka Connect, Salesforce Platform Events, Confluent Platform, Snowflake, Oracle, DB2, PostgreSQL, Key Vault, ksqlDB, Kafka Streams, REST Proxy, GitHub Actions, Terraform, Schema Registry, Python, SQL, ADF, Azure Functions
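    As an illustration of the Delta Lake ingestion pattern in the first bullet, here is a minimal PySpark sketch; the table name (mdm.members), key column (member_id), and landing path are hypothetical, not taken from the profile.

      # Minimal sketch of a Delta Lake upsert on Databricks, assuming a
      # hypothetical mdm.members target table keyed on member_id.
      from pyspark.sql import SparkSession
      from delta.tables import DeltaTable

      spark = SparkSession.builder.getOrCreate()

      # Read a newly landed membership file (hypothetical path and layout).
      incoming = (spark.read
                  .option("header", "true")
                  .csv("/mnt/landing/membership/"))

      # Merge into the MDM table: update existing members, insert new ones.
      target = DeltaTable.forName(spark, "mdm.members")
      (target.alias("t")
       .merge(incoming.alias("s"), "t.member_id = s.member_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())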
  • Johnson & Johnson Health And Wellness Solutions, Inc.
    Sr Data Engineer
    Jan 2020 - Aug 2022
    Responsibilities:
    • Designed and implemented scalable data pipelines using Azure services such as ADLS Gen2, ADF, and Azure Functions to extract, transform, and load data into the Databricks data warehouse.
    • Developed scalable data processing pipelines using Apache Spark on Azure Databricks, handling large volumes of data and ensuring data quality throughout the pipeline.
    • Worked on role-based access control in Databricks to ensure data security and compliance.
    • Created a data ingestion framework in Databricks to handle batch data in various file formats (XML, JSON, Avro) using Snowflake stages and Snowpipe.
    • Extracted data from API endpoints into S3 and loaded it into Snowflake, improving data availability and accessibility.
    • Built and optimized Databricks warehouse structures for efficient data loading and transformation.
    • Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
    • Implemented monitoring and troubleshooting for the Databricks warehouse and optimized usage and costs.
    • Created external and permanent tables in Snowflake on Azure Data Lake Gen2 and implemented incremental data loads into Snowflake tables using an ADLS Gen2 staging layer (see the sketch below).
    Environment: Spark, Hadoop, YARN, Azure, HTML, Python, Databricks, Kubernetes, JDBC, Teradata, NoSQL, Sqoop, MySQL, Airflow
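    To make the incremental Snowflake load concrete, here is a minimal Python sketch using the snowflake-connector-python package; the account, stage, and table names are hypothetical, and customer_raw is assumed to have a single VARIANT column for raw JSON.

      # Minimal sketch of an incremental load from an ADLS Gen2 external
      # stage into Snowflake; COPY INTO skips files it has already loaded,
      # which is what makes repeated runs incremental.
      import snowflake.connector

      conn = snowflake.connector.connect(
          account="my_account",   # hypothetical account identifier
          user="etl_user",
          password="...",
          warehouse="ETL_WH",
          database="ANALYTICS",
          schema="STAGING",
      )
      conn.cursor().execute("""
          COPY INTO staging.customer_raw
          FROM @adls_stage/customer/
          FILE_FORMAT = (TYPE = 'JSON')
      """)
      conn.close()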
  • Disney General Entertainment Content
    AWS Data Engineer
    Jul 2017 - Dec 2019
    Responsibilities:
    • Designed and set up an enterprise data lake to support various use cases, including analytics, processing, storage, and reporting of voluminous, rapidly changing data.
    • Maintained quality reference data in the source by performing cleaning and transformation, and ensured integrity in a relational environment by working closely with stakeholders and the solution architect.
    • Ingested data from Nielsen into S3 for real-time analytics, processing it to gain insights.
    • Designed and developed a security framework providing fine-grained access to objects in AWS S3 using AWS Lambda and DynamoDB (a minimal sketch follows below).
    • Performed end-to-end architecture and implementation assessments of various AWS services, including Amazon EMR, Redshift, and S3.
    • Implemented machine learning algorithms in Python to predict the quantity a user might want to order for a specific item, enabling automatic suggestions, using Kinesis Firehose and the S3 data lake.
    • Imported data from sources like HDFS/HBase into Spark RDDs and performed computations using PySpark to generate the output response.
    • Implemented AWS Step Functions to automate and orchestrate Amazon SageMaker tasks such as publishing data to S3, training an ML model, and deploying it for prediction.
    • Integrated Apache Airflow with AWS to monitor multi-stage ML workflows with tasks running on Amazon SageMaker.
    Environment: AWS EMR, S3, RDS, Redshift, Lambda, Boto3, DynamoDB, Amazon SageMaker, Apache Spark, HBase, Apache Kafka, Hive, Sqoop, MapReduce, Snowflake, Apache Pig, Python, SSRS, Tableau.
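    As a sketch of the fine-grained S3 access idea, here is a minimal AWS Lambda handler in Python using boto3; the object_permissions DynamoDB table, its key schema, and the bucket name are hypothetical.

      # Check a per-user permission record in DynamoDB, then hand out a
      # short-lived presigned URL rather than opening the bucket directly.
      import boto3

      dynamodb = boto3.resource("dynamodb")
      s3 = boto3.client("s3")
      perms = dynamodb.Table("object_permissions")  # hypothetical table

      def handler(event, context):
          user = event["user_id"]
          key = event["object_key"]

          # Composite-key lookup: may this user read this object?
          item = perms.get_item(Key={"user_id": user, "object_key": key})
          if "Item" not in item:
              return {"statusCode": 403, "body": "access denied"}

          url = s3.generate_presigned_url(
              "get_object",
              Params={"Bucket": "data-lake-bucket", "Key": key},  # hypothetical bucket
              ExpiresIn=300,
          )
          return {"statusCode": 200, "body": url}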
  • Bank Of America
    Big Data Engineer / Hadoop Developer
    Feb 2015 - Jul 2017
    Responsibilities:
    • Interacted with business partners, business analysts, and the product owner to understand requirements and build scalable distributed data solutions using the Hadoop ecosystem.
    • Worked with the Hive data warehouse infrastructure: creating tables, distributing data via partitioning and bucketing, and writing and optimizing HQL queries.
    • Developed ETL processes (Data Stage Open Studio) to load data from multiple sources into HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive.
    • Developed Spark scripts and UDFs using both the Spark DSL and Spark SQL for data aggregation and querying, writing data back into an RDBMS through Sqoop (see the sketch below).
    • Wrote multiple MapReduce jobs using the Java API, Pig, and Hive for data extraction, transformation, and aggregation from multiple file formats, including Parquet, Avro, XML, JSON, CSV, and ORC, with compression codecs such as gzip, Snappy, and LZO.
    • Applied a strong understanding of partitioning and bucketing concepts in Hive, designing both managed and external tables to optimize performance.
    • Developed Pig UDFs for manipulating data according to business requirements, and developed custom Pig loaders.
    • Developed ETL pipelines in and out of the data warehouse using a combination of Python and Snowflake's SnowSQL, writing SQL queries against Snowflake.
    • Developed data pipeline programs with the Spark Scala APIs, performed data aggregations with Hive, and formatted data (JSON) for visualization.
    Environment: AWS, Cassandra, PySpark, Apache Spark, HBase, Apache Kafka, Hive, Sqoop, Flume, Apache Oozie, ZooKeeper, ETL, UDF, MapReduce, Snowflake, Apache Pig, Python, Java, SSRS.
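    Here is a minimal PySpark sketch of the Spark SQL UDF plus JDBC write-back pattern mentioned above; the table names, JDBC URL, and normalization rule are hypothetical, and a MySQL JDBC driver is assumed to be on the classpath.

      from pyspark.sql import SparkSession, functions as F
      from pyspark.sql.types import StringType

      spark = SparkSession.builder.getOrCreate()

      # UDF normalizing free-text region codes before aggregation.
      normalize = F.udf(lambda s: (s or "").strip().upper(), StringType())

      summary = (spark.table("warehouse.transactions")
                 .withColumn("region", normalize("region"))
                 .groupBy("region")
                 .agg(F.sum("amount").alias("total_amount")))

      # Write the aggregates back to a relational store over JDBC.
      (summary.write
       .format("jdbc")
       .option("url", "jdbc:mysql://db-host:3306/reporting")
       .option("dbtable", "region_totals")
       .option("user", "etl")
       .option("password", "...")
       .mode("overwrite")
       .save())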
  • State Of Wisconsin
    Data Warehouse / ETL Developer
    Feb 2013 - Jan 2015
    Responsibilities:
    • Served as a SQL Server analyst/developer/DBA, using SQL Server 2012, 2014, and 2016 to optimize large-scale healthcare databases, enhancing data availability.
    • Designed and scheduled DTS/SSIS packages, improving data transfer efficiency within healthcare data systems.
    • Updated Erwin models for the Consolidated Data Store (CDS), Actuarial Data Mart (ADM), and Reference DB, aligning with evolving healthcare standards and user requirements.
    • Exported current data models as PDFs and shared them via SharePoint for increased stakeholder accessibility.
    • Authored triggers, stored procedures, and functions in Transact-SQL (T-SQL) to support robust healthcare data operations (a minimal sketch of calling one such procedure follows below).
    • Deployed scripts according to Configuration Management and Playbook requirements, maintaining seamless data operations.
    • Optimized data storage and access by managing files/filegroups and table/index associations.
    • Improved data processing speed and efficiency through query tuning and performance tuning.
    • Ensured data accuracy and integrity using Quality Center for defect tracking and resolution.
    • Safeguarded sensitive healthcare data and compliance by maintaining user roles and permissions.
    Environment: Maven, CI/CD (Jenkins), Tableau, JIRA, Python, UNIX shell scripting, SQL Server 2008/2012 Enterprise Edition, SSRS, SSIS, T-SQL, Windows Server 2003, PerformancePoint Server 2007, Oracle 10g, Visual Studio 2010.
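    For the stored-procedure work, here is a minimal Python sketch using pyodbc; the server, database, procedure name (dbo.usp_refresh_claims), and its parameter are hypothetical.

      # Invoke a T-SQL stored procedure from Python over ODBC.
      import pyodbc

      conn = pyodbc.connect(
          "DRIVER={ODBC Driver 17 for SQL Server};"
          "SERVER=sql-host;DATABASE=healthcare;Trusted_Connection=yes;"
      )
      cursor = conn.cursor()

      # Parameterized call to a hypothetical claims-refresh procedure.
      cursor.execute("EXEC dbo.usp_refresh_claims @as_of_date = ?", "2015-01-31")
      conn.commit()
      conn.close()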

Ashish J Education Details

Mumbai University, Mumbai

Frequently Asked Questions about Ashish J

What company does Ashish J work for?

Ashish J works for BCBS.

What is Ashish J's role at the current company?

Ashish J's current role is Senior Data Engineer.

What schools did Ashish J attend?

Ashish J attended Mumbai University, Mumbai.

Not the Ashish J you were looking for?

  • Ashish J

    Charlotte, NC
  • Ashish J

    Java Full Stack Developer | J2EE, Spring, Hibernate | JPA, REST, Microservices, React, Agile | Scrum, CI/CD, Jenkins, DevOps | AWS | MongoDB | MySQL | Microservices | C2C | C2H Roles
    Fairborn, OH
  • Ashish J

    Vice President - Global Business Strategy & Sales Head - Global FPO
    Mount Pleasant, SC
  • Ashish J

    Business Systems Analyst | Systems Analyst
    O'Fallon, MO
