Dev J

Dev J Email and Phone Number

Lead Data Engineer at Bank of America
Charlotte, North Carolina, United States
Dev J's Location
Phoenix, Arizona, United States
About Dev J

• 10+ years of expertise in the Big Data ecosystem: data acquisition, ingestion, modeling, storage, analysis, integration, and processing.
• Experience with Azure Cloud, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, and Azure analytics services to ingest, transform, and consolidate structured and unstructured data for downstream use cases.
• Experience building data pipelines using Azure Data Factory and Azure Databricks, and loading data into Azure Data Lake and Azure SQL Database/Data Warehouse with user-level access control.
• Experienced in building data ingestion pipelines on Azure HDInsight Spark clusters using Azure Data Factory and Spark SQL.
• Experienced in building applications with AWS services such as S3, EMR, Amazon Redshift, Elastic Load Balancing, IAM, Auto Scaling, CloudWatch, CloudFront, SNS, SQS, SES, and Lambda.
• Experienced with Power BI, Tableau, and AWS QuickSight for data visualization and building dashboards/reports.
• Experience creating, managing, analyzing, and reporting internal business client data using AWS services such as Athena, Redshift, EMR, and QuickSight.
• Responsible for storing data on S3 using Lambda functions and AWS Glue with PySpark.
• Experience with batch ingestion into the platform for Snowflake consumption.
• Worked with distributed frameworks such as Apache Spark and Presto on Amazon EMR and Redshift, interacting with data in other AWS storage services such as Amazon S3 and Amazon DynamoDB.
• Automated daily and weekly ETL jobs using Apache Airflow.
• Experience with Microsoft SQL Server database programming and as an ETL developer using SSIS, SSRS, and SSAS.
• Experience with Python libraries such as NumPy, Pandas, and Matplotlib.
• Skilled in systems analysis, E-R/dimensional data modeling, database design, and implementing RDBMS-specific features.
• Good knowledge of converting Hive/SQL queries into PySpark DataFrame transformations (see the sketch after this list).
• Experience working with different file formats such as JSON, Avro, Parquet, and CSV.
• Experience developing Spark applications in Python/Scala to compare the performance of Spark with Hive.
• Good working experience with Hive and HBase/MapR-DB integration.
• Experienced in developing shell scripts and Python scripts to automate Spark jobs and Hive scripts.
• Experience with incident tracking and ticketing systems such as Jira, ServiceNow, and Remedy; used Git and SVN for version control.
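
To make the Hive-to-PySpark point above concrete, here is a minimal sketch of a Hive SQL aggregation rewritten as DataFrame transformations. The table and column names (transactions, txn_date, region, amount) are illustrative placeholders, not datasets named in this profile.

```python
# Hypothetical sketch: a Hive SQL aggregation rewritten as PySpark
# DataFrame transformations. All names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("hive-to-dataframe")
    .enableHiveSupport()
    .getOrCreate()
)

# Hive SQL version:
#   SELECT region, SUM(amount) AS total_amount
#   FROM transactions
#   WHERE txn_date >= '2020-01-01'
#   GROUP BY region

df = (
    spark.table("transactions")                      # Hive-managed table
    .where(F.col("txn_date") >= "2020-01-01")        # WHERE clause
    .groupBy("region")                               # GROUP BY
    .agg(F.sum("amount").alias("total_amount"))      # SUM(...) AS ...
)
df.show()
```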

Dev J's Current Company Details
Bank of America

Lead Data Engineer at Bank of America
Charlotte, North Carolina, United States
Employees:
250,057
Dev J Work Experience Details
  • Bank of America
    Lead Data Engineer
    Bank of America Feb 2020 - Present
    Arizona, United States
    - Ingested data from various on-prem and external sources into Azure Data Lake, increasing availability from Azure SQL Data Warehouse.
    - Built data pipelines in Azure Data Factory, connecting to different database sources through JDBC connectors.
    - Good experience processing ingested data using Azure Databricks.
    - Designed and implemented data pipelines to transform retail transaction data from various sources, ensuring data accuracy, consistency, and availability for downstream analytics and reporting.
    - Worked across the Azure cloud platform (HDInsight, Databricks, Data Lake, Blob, Data Factory, Synapse, SQL DB, and SQL DWH).
    - Developed API proxies and products on the Apigee platform.
    - Designed and implemented Kafka by configuring topics in a new cluster across all environments.
    - Implemented Kafka security features using SSL, without Kerberos (a minimal sketch follows this entry).
    - Adhered to the ANSI SQL language specification wherever possible, providing context about similar functionality in other industry-standard engines (e.g., referencing PostgreSQL function documentation).
    - Designed and developed a data loader using Apigee, Denodo, and Informatica.
    - Developed Databricks notebooks to extract data from source systems such as DB2 and Teradata, performing data cleansing, wrangling, ETL processing, and loading into Azure SQL DB.
    - Installed and configured the OpenShift platform for managing Docker containers and Kubernetes clusters.
    - Migrated data from Microsoft SQL Server to Azure SQL Database.
    - Created automated Databricks workflows in Python to run multiple data loads and increase parallel processing.
    - Built ETL data pipelines to move data from Blob Storage to Azure Data Lake Gen2 using Azure Data Factory (ADF).
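
As a companion to the Kafka SSL bullet above, here is a minimal sketch of an SSL-secured producer using the kafka-python client. The broker addresses, certificate paths, and topic name are hypothetical placeholders; the actual cluster configuration is not described in this profile.

```python
# Hypothetical sketch of an SSL-secured Kafka producer (kafka-python).
# Brokers, cert paths, and topic are placeholders.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["broker1:9093", "broker2:9093"],  # placeholder brokers
    security_protocol="SSL",               # TLS encryption, no Kerberos
    ssl_cafile="/etc/kafka/ca.pem",        # CA that signed the broker certs
    ssl_certfile="/etc/kafka/client.pem",  # client certificate
    ssl_keyfile="/etc/kafka/client.key",   # client private key
)

# Publish one record to an illustrative topic.
producer.send("retail-transactions", b'{"txn_id": 1, "amount": 42.50}')
producer.flush()
```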
  • ING
    Sr Data Engineer
    ING Dec 2018 - Jan 2020
    - Developed Databricks solutions for data extraction, transformation, and aggregation from multiple data sources.
    - Wrote AWS Lambda functions in Python, invoking Python scripts for data transformations and analytics on large data sets in EMR clusters and AWS Kinesis data streams.
    - Developed RESTful web services using Java and Spring Boot.
    - Worked on ETL migration by developing and deploying AWS Lambda functions for a serverless data pipeline that writes to the Glue Catalog and can be queried from Athena (see the Lambda sketch after this entry).
    - Analyzed and developed Java/Spring components for generating and storing payment receipts and payrolls in PDF format.
    - Designed and built Spark data-processing applications on an AWS EMR cluster that consume data from AWS S3 buckets, apply the necessary transformations, and store the curated business datasets.
    - Wrote and executed complex SQL queries joining tables in AWS Glue for ETL operations on Spark DataFrames using Spark SQL.
    - Worked extensively on AWS S3 data transfer, with AWS Redshift used for cloud data storage.
    - Migrated an Oracle SQL ETL to run on Google Cloud Platform using Cloud Dataproc and BigQuery, with Cloud Pub/Sub triggering the Airflow jobs.
    - Wrote Python scripts to migrate data from the old system to AWS Redshift.
    - Worked extensively with PySpark/Spark SQL for querying DataFrames and RDDs.
    - Implemented and maintained monitoring and alerting of production and corporate servers/storage using AWS CloudWatch.
    - Experienced with file formats such as XML, JSON, Avro, and Parquet.
    - Maintained user accounts (IAM) and the SQS and SNS services on AWS.
    - Used Bitbucket to collaborate with other team members.
    - Used Cloud Pub/Sub and Cloud Functions for specific use cases such as triggering workflows upon message arrival.
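
As a sketch of the serverless Lambda/Glue/Athena pattern mentioned above: a handler that reacts to S3 object-created events and starts a Glue crawler so the new data becomes queryable from Athena. The crawler name and event shape are assumptions for illustration, not details from this profile.

```python
# Hypothetical Lambda handler for a serverless pipeline: on an S3
# object-created event, trigger a Glue crawler so new data is
# catalogued and queryable from Athena. Crawler name is a placeholder.
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # Each record corresponds to one newly created S3 object.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"New object: s3://{bucket}/{key}")

    # Refresh the Glue Data Catalog so Athena sees the new data.
    glue.start_crawler(Name="curated-data-crawler")  # placeholder name
    return {"status": "crawler started"}
```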
  • Staples
    Data Engineer
    Staples Sep 2016 - Nov 2018
    - Performed data transformations in Hive, using partitions and buckets to improve performance (see the Spark SQL sketch after this entry).
    - Experienced in handling HDFS, Job Tracker, Task Tracker, NameNode, DataNode, YARN, Spark, and MapReduce programming.
    - Configured and monitored resource utilization across the cluster using Cloudera Manager, Search, and Navigator.
    - Created external Hive tables for consumption, storing data in HDFS in the ORC, Parquet, and Avro file formats.
    - Created ETL pipelines using Apache PySpark with the Spark SQL and DataFrame APIs.
    - Responsible for coding Java batch jobs, RESTful services, MapReduce programs, and Hive queries, as well as testing, debugging, peer code review, troubleshooting, and status reporting.
    - Analyzed Hadoop clusters and various Big Data analytic tools such as Pig, Hive, HBase, Spark, and Sqoop.
    - Used Sqoop to load data into the cluster from dynamically generated files and relational database management systems.
    - Developed service classes, domain objects/DAOs, and controllers using Java/J2EE technologies.
    - Implemented partitioning, dynamic partitions, and buckets in Hive.
    - Developed HQL queries, mappings, tables, and external tables in Hive for analysis across multiple banners, and worked on partitioning, optimization, compilation, and execution.
    - Used Cloudera Manager to continuously monitor and manage the Hadoop cluster.
    - Migrated data successfully from on-prem to AWS EMR and S3 buckets by writing shell scripts.
    - Invoked Python scripts for data transformations on large data sets in AWS Kinesis.
    - Built mappings with reusable components such as worklets and mapplets, as well as other transformations.
    - Automated data movement between components using Apache NiFi.
    - Loaded data from multiple data sources (SQL, DB2, and Oracle) into HDFS using Sqoop and stored it in Hive tables.
    - Migrated data from Teradata into HDFS using Sqoop.
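
A minimal sketch of the Hive partitioning and bucketing described above, issued through PySpark's Spark SQL interface to stay consistent with the other examples. The table names, bucket count, and HDFS location are illustrative assumptions.

```python
# Hypothetical sketch: a partitioned, bucketed external Hive table,
# loaded with dynamic partitioning via Spark SQL. Names are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hive-partitioning")
    .enableHiveSupport()
    .getOrCreate()
)

# Allow dynamic partition inserts without a static partition spec.
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS sales_orc (
        order_id BIGINT,
        amount   DOUBLE
    )
    PARTITIONED BY (order_date STRING)
    CLUSTERED BY (order_id) INTO 16 BUCKETS
    STORED AS ORC
    LOCATION '/data/warehouse/sales_orc'
""")

# Partition column must come last in the SELECT list.
spark.sql("""
    INSERT OVERWRITE TABLE sales_orc PARTITION (order_date)
    SELECT order_id, amount, order_date FROM staging_sales
""")
```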
  • Novartis
    Data Analyst
    Novartis Jul 2015 - Aug 2016
    - Designed, developed, and implemented business intelligence reports using Tableau.
    - Built interactive dashboards for monthly performance reports using Power BI.
    - Troubleshot, resolved, and escalated data-related issues, validating data to improve data quality (a small validation sketch follows this entry).
    - Defined data requirements and elements used in XML transactions.
    - Performed unit testing of reports.
    - Performed visualizations using the data engine and extracts, and connected to them.
    - Worked extensively on creating views and tables with MS SQL.
    - Used the Database Engine Tuning Advisor and monitoring tools for database analysis.
    - Experience using data mining techniques to extract information from data sets and identify correlations and patterns.
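
Since this role emphasized validating data to improve quality, here is a small, hypothetical pandas validation pass of the kind that might precede a report refresh; the file and column names are invented for illustration.

```python
# Hypothetical data-quality check in pandas: flag nulls and duplicate
# keys before a dataset feeds a dashboard. Names are illustrative.
import pandas as pd

df = pd.read_csv("monthly_performance.csv")  # placeholder input file

issues = {
    "null_counts": df.isna().sum().to_dict(),                    # nulls per column
    "duplicate_keys": int(df.duplicated(subset=["record_id"]).sum()),  # repeated IDs
}
print(issues)
```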
  • Sciens Technologies
    Software Engineer
    Sciens Technologies May 2014 - Apr 2015
    Hyderabad, Telangana, India
    - Analyzed, implemented, and solved research problems using various numerical and stochastic methods in C++ and Python modules for computer experiment design and analysis.
    - Designed and developed a user interface using HTML, AJAX, CSS, and JavaScript.
    - Designed and developed a data management system using MySQL.
    - Created class diagrams and sequence diagrams using UML and Rational Rose.
    - Modified and executed Python/Django modules to change data formats.
    - Hands-on experience accessing database objects through Django's database APIs.
    - Wrote Python scripts to load data from parsed XML documents into the database (see the sketch after this entry).
    - Handled all client-side validation using JavaScript.
    - Expertise in writing constraints, indexes, views, stored procedures, cursors, triggers, and user-defined functions.
    - Worked with SQL and stored-procedure development on MySQL and SQLite.
    - Participated in creating SOAP web services for transmitting and receiving data in XML format from an external interface.
    - Tested frontend and backend modules using the Django web framework.
    - Used NumPy for numerical analysis and Matplotlib from the SciPy stack for data analysis and plotting.
    - Used Jira to track the agile/scrum process and development status.
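
As a sketch of the XML-to-database loading scripts mentioned above, here is a minimal Python example that parses an XML document and loads records into SQLite; the file name, element names, and schema are illustrative assumptions.

```python
# Hypothetical sketch: parse an XML document and load records into
# SQLite. File name, element names, and schema are placeholders.
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect("receipts.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS receipts (receipt_id TEXT PRIMARY KEY, amount REAL)"
)

tree = ET.parse("receipts.xml")  # placeholder input file
for node in tree.getroot().iter("receipt"):
    conn.execute(
        "INSERT OR REPLACE INTO receipts VALUES (?, ?)",
        (node.get("id"), float(node.findtext("amount", "0"))),
    )

conn.commit()
conn.close()
```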

Frequently Asked Questions about Dev J

What company does Dev J work for?

Dev J works for Bank of America.

What is Dev J's role at the current company?

Dev J's current role is Lead Data Engineer at Bank of America.

Who are Dev J's colleagues?

Dev J's colleagues are Ryanne Cory, Shubham Kumar Gautam, Katie Proctor, Stephanie Rocha, Tim Landow, Muhammed Kabir Abdullahi, Neelima Kambhampati.
