Rakesh G.

Rakesh G. Email and Phone Number

Senior Data Engineer | Azure Certified Data Engineer | ETL | SQL Developer | AWS Data Engineer | Scala | Snowflake | Big Data | Hadoop | Azure Databricks | PySpark | Teradata | "VISA SPONSORSHIP NOT REQUIRED" | @ Bank of America
Charlotte, North Carolina, United States
Rakesh G.'s Location
Jersey City, New Jersey, United States
About Rakesh G.

Overall, 10+ years of experience as a Data Engineer, with proficiency in the engineering, design, development, and productionization of data and pipeline solutions for enterprise-level use cases. Strong professional background in solving challenging business problems and working closely with product teams, business stakeholders, architecture teams, end users, and clients. I have developed, built, and maintained database systems for over seven years, built ETL data pipelines for automated and continuous data interchange, and am well versed in delivering ETL solutions.

I have worked with cloud computing platforms such as AWS and Microsoft Azure, as well as proofs of concept related to GCP. To build data pipelines and architectures, I have worked closely with Azure services including Azure Data Factory, Blob Storage, Databricks, Data Lake, Azure Synapse Analytics, and Azure SQL Database. I also have extensive knowledge of key AWS services such as EC2, S3, DynamoDB, EMR, Lambda, Athena, Redshift, Glue, and RDS. I have hands-on experience in data wrangling, ingestion, cleaning, transformation, integration, enrichment, and validation, along with extensive experience with both NoSQL databases such as MongoDB and Cassandra and relational databases such as Oracle, MySQL, SQL Server, PostgreSQL, and DB2.

I have used Apache Spark and its components, including PySpark, Spark SQL, and Spark Streaming with Apache Kafka and NiFi, for efficient data transformation, analysis, and real-time processing. Big data is a major part of my skill set: I have worked across the Hadoop ecosystem, including Hadoop, Spark, MapReduce, Kafka, Hive, Sqoop, and HBase, and am experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN. I have practical knowledge of big data programming languages such as Scala, am proficient in BI tools such as Tableau and Power BI for data visualization, and have experience with version control systems such as Git and SVN.

Rakesh G.'s Current Company Details
Bank of America

Senior Data Engineer | Azure Certified Data Engineer | ETL | SQL Developer | AWS Data Engineer | Scala | Snowflake | Big Data | Hadoop | Azure Databricks | PySpark | Teradata | "VISA SPONSORSHIP NOT REQUIRED"
Charlotte, North Carolina, United States
Employees:
250057
Rakesh G. Work Experience Details
  • Bank of America
    Azure Data Engineer
    Bank of America May 2022 - Present
    New Jersey, United States
    Designed end-to-end scalable architecture to address business challenges using Azure components.
    Utilized Azure Data Factory, the SQL API, and the MongoDB API to integrate data from sources such as MongoDB, MS SQL, and cloud databases.
    Developed Spark applications for data extraction, transformation, and aggregation from various file formats.
    Designed SSIS packages to transfer data from flat files to SQL Server, with checks to ensure data quality.
    Authored pipelines in Azure Data Factory (ADF) to extract, transform, and load data from diverse sources such as Azure SQL and Blob Storage.
    Utilized Azure Synapse Analytics for data processing and migration initiatives.
    Designed and developed real-time stream processing applications with Spark, Kafka, Scala, and Hive for streaming ETL/ELT and machine learning.
    Partitioned and bucketed Hive tables in Parquet format with Snappy compression, loading data from Avro Hive tables (see the PySpark sketch after the experience list).
    Architected scalable data processing and analytics solutions for Azure HDInsight, addressing technical feasibility, integration, and development.
    Utilized Azure Kubernetes Service and Docker containers for deployment, scaling, and load balancing.
    Implemented strategies for optimizing continuous integration, release, and deployment processes using container and virtualization techniques.
    Collected JSON data from HTTP sources and developed Spark APIs for data insertion and updates in Hive tables.
    Facilitated analytical reporting and data insights for Power BI dashboards.
    Employed Git for version control and Jira for project management, efficiently tracking and resolving issues and bugs.
    Environment: Azure (Data Lake, Data Factory, Synapse Analytics, HDInsight, SQL Server, ML Studio), Power BI, Hive, Spark, Databricks, Python, PySpark, Scala, SQL, Sqoop, Kafka, Airflow, Oozie, HBase, Oracle, Teradata, Cassandra, MLlib, Tableau, Git, Jira.
  • Chevron
    AWS Data Engineer
    Chevron Nov 2021 - May 2022
    Texas, United States
    Involved in building data pipelines and performing analysis using the AWS stack (EMR, EC2, S3, RDS, Lambda, Glue, SQS, and Redshift).
    Hands-on expertise with AWS databases such as RDS (Aurora), Redshift, DynamoDB, and ElastiCache.
    Developed AWS data pipelines for data extraction and transformation across S3, RDS, DynamoDB, Oracle, DB2, SQL Server, and Cassandra data sources.
    Developed and executed a migration strategy to move data from an Oracle DB platform to AWS Redshift.
    Involved in migrating data from an on-prem Cloudera cluster to AWS EC2 instances deployed on an EMR cluster, and developed an ETL pipeline to extract logs, store them in an AWS S3 data lake, and process them further using PySpark.
    Developed data ingestion techniques for batch and stream processing using AWS Batch, AWS Kinesis, and AWS Data Pipeline.
    Developed PySpark code for AWS Glue jobs to transfer data from HDFS to AWS S3 (see the Glue sketch after the experience list), loaded data into Amazon Redshift, and used AWS CloudWatch to collect and monitor AWS RDS instances.
    Created reusable SSIS packages to extract data from multi-formatted flat files, Excel, XML files, etc.
    Used Spark's in-memory capabilities to handle large datasets on the S3 data lake; loaded data into S3 buckets, then migrated and loaded it into Hive external tables.
    Developed Spark SQL to load JSON data, create schema RDDs, load them into Hive tables, and handle structured data.
    Developed Oozie workflows for scheduling and orchestrating the ETL cycle, and was involved in writing Python scripts for automation using Airflow DAGs.
    Scheduled and monitored Apache Airflow DAGs to run multiple Hive and Spark jobs, which run independently based on time and data availability (see the Airflow sketch after the experience list).
    Environment: Hadoop, Hive, Spark, AWS, EC2, S3, Lambda, Glue, Elasticsearch, RDS, DynamoDB, Redshift, ECS, Python, PySpark, Scala, SQL, Sqoop, Kafka, Airflow, Oozie, HBase, Oracle, Teradata, Cassandra, MLlib, Tableau, Maven, Git, Jira.
  • Canon Inc.
    Data Engineer
    Canon Inc. Mar 2018 - Oct 2021
    Melville, New York, United States
    Analyzed project requirements and formulated an architecture document for a big data initiative.
    Provided support for MapReduce Python programs running on the cluster.
    Optimized the performance of Amazon Redshift and Apache Hadoop clusters, enhancing data distribution and processing efficiency.
    Created MapReduce applications to process Avro files, perform data calculations, and execute map-side joins.
    Employed MapReduce programs to facilitate bulk data import into HBase and used a REST API to access HBase data for analytics.
    Designed and executed strategies for incremental imports into Hive tables.
    Developed Hive tables, managed data loading, and crafted MapReduce-based internal queries.
    Orchestrated data collection, aggregation, and movement from servers to HDFS via Flume.
    Used Sqoop to import and export data between relational data sources such as DB2, SQL Server, and Teradata, and HDFS.
    Transformed complex MapReduce programs into memory-efficient Spark processing using transformations and actions (see the Spark sketch after the experience list).
    Conducted a proof of concept (POC) for IoT device data using Spark.
    Leveraged Scala to store streaming data in HDFS and implemented Spark for accelerated data processing.
    Generated RDDs and DataFrames for the requisite input data and conducted data transformations using Spark with Python.
    Developed Spark SQL queries, managed DataFrames, imported data, executed transformations, performed read/write operations, and saved the outcomes to an HDFS output directory.
    Crafted Hive jobs to parse and structure logs into tabular format, optimizing querying capabilities on log data.
    Created Pig scripts for semi-structured data analysis and developed custom Pig loaders and UDFs to meet business requirements.
    Provided cluster coordination services through ZooKeeper.
    Used the Oozie workflow engine to schedule and orchestrate ETL processes.
    Managed and reviewed Hadoop log files using shell scripts.
    Migrated ETL jobs to perform tasks
  • Infosys
    Software Engineer
    Infosys Dec 2013 - Feb 2017
    Bengaluru, Karnataka, India
    Designed and implemented normalized database schemas, ensuring data integrity and efficient storage.
    Translated business requirements into data models, identifying relationships, constraints, and key attributes.
    Gathered business requirements and converted them into new T-SQL stored procedures in a Visual Studio database project.
    Performed unit tests on all code and packages.
    Analyzed requirements and impact by participating in online Joint Application Development sessions with the business client.
    Performed and automated SQL Server version upgrades and patch installs, and maintained relational databases.
    Performed front-line code reviews for other development teams.
    Modified and maintained SQL Server stored procedures, views, ad hoc queries, and SSIS packages used in the search engine optimization process.
    Updated existing reports and created new ones using Microsoft SQL Server Reporting Services, on a team of two developers.
    Implemented role-based access control and data encryption mechanisms to ensure data security and compliance with industry regulations.
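
The following is a minimal PySpark sketch of the Hive-table layout described in the Bank of America role: a Parquet table with Snappy compression, partitioned and bucketed, loaded from an Avro-backed staging table. The database, table, and column names (staging.transactions_avro, curated.transactions, load_date, account_id) are illustrative assumptions, not details from the profile.

```python
from pyspark.sql import SparkSession

# Hive-enabled session; Snappy is set as the Parquet compression codec.
spark = (
    SparkSession.builder
    .appName("hive-partitioned-load")
    .config("spark.sql.parquet.compression.codec", "snappy")
    .enableHiveSupport()
    .getOrCreate()
)

# Source data: an Avro-backed Hive staging table (placeholder name).
src = spark.table("staging.transactions_avro")

# Write a managed Parquet table, partitioned by load date and bucketed on the join key.
(
    src.write
    .mode("overwrite")
    .format("parquet")
    .partitionBy("load_date")       # one directory per load_date
    .bucketBy(16, "account_id")     # 16 buckets on the join/lookup key
    .sortBy("account_id")
    .saveAsTable("curated.transactions")
)
```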
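
Next, a hedged sketch of the kind of AWS Glue PySpark job mentioned in the Chevron role, moving data into S3 after a light transformation. The source path, target bucket, and column names are placeholders, not details from the profile.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Resolve the standard Glue job argument; JOB_NAME is supplied by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the source files (illustrative HDFS path) as a Spark DataFrame.
source_df = spark.read.parquet("hdfs:///data/raw/transactions/")

# Light cleanup before landing the curated data in S3.
cleaned_df = source_df.dropDuplicates(["transaction_id"]).filter("amount IS NOT NULL")

# Write the output to an assumed S3 location, partitioned by load date.
(
    cleaned_df.write
    .mode("overwrite")
    .partitionBy("load_date")
    .parquet("s3://example-curated-bucket/transactions/")
)

job.commit()
```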
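
Likewise, a minimal sketch of an Apache Airflow DAG of the sort described in the Chevron role, running a Hive load followed by a Spark job on a daily schedule. The operator choices, connection IDs, queries, file paths, and schedule are assumptions for illustration.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.apache.hive.operators.hive import HiveOperator
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_hive_and_spark_etl",      # illustrative DAG name
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:

    # Refresh a Hive staging table for the run date (query and connection are placeholders).
    load_staging = HiveOperator(
        task_id="load_staging",
        hql="INSERT OVERWRITE TABLE staging.events SELECT * FROM raw.events WHERE dt = '{{ ds }}'",
        hive_cli_conn_id="hive_cli_default",
    )

    # Run the downstream PySpark transformation once the Hive load succeeds.
    transform = SparkSubmitOperator(
        task_id="transform_events",
        application="/opt/jobs/transform_events.py",   # assumed job script path
        conn_id="spark_default",
        application_args=["--run-date", "{{ ds }}"],
    )

    load_staging >> transform
```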
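
Finally, a small PySpark sketch of the MapReduce-to-Spark rewrite pattern mentioned in the Canon role: a word-count-style aggregation expressed as Spark transformations and actions, with results saved to an HDFS output directory. The input and output paths are placeholders.

```python
from operator import add

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mapreduce-to-spark").getOrCreate()
sc = spark.sparkContext

# Transformations only build the lineage; nothing executes yet.
lines = sc.textFile("hdfs:///data/logs/input/")          # placeholder input path
counts = (
    lines.flatMap(lambda line: line.split())             # map phase equivalent
         .map(lambda word: (word, 1))
         .reduceByKey(add)                                # reduce phase equivalent
)

# Actions trigger execution: persist results and pull a small sample to the driver.
counts.saveAsTextFile("hdfs:///data/logs/word_counts/")  # placeholder output path
print(counts.take(10))
```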

Rakesh G. Education Details

Frequently Asked Questions about Rakesh G.

What company does Rakesh G. work for?

Rakesh G. works for Bank of America.

What is Rakesh G.'s role at the current company?

Rakesh G.'s current role is Senior Data Engineer | Azure Certified Data Engineer | ETL | SQL Developer | AWS Data Engineer | Scala | Snowflake | Big Data | Hadoop | Azure Databricks | PySpark | Teradata | "VISA SPONSORSHIP NOT REQUIRED".

What schools did Rakesh G. attend?

Rakesh G. attended the University of Central Missouri.

Who are Rakesh G.'s colleagues?

Rakesh G.'s colleagues are Thomas Schloeder, MBA; Eric A. Walker; Olivia Lim; Rob Riechmann; Violeta Inchaustegui; Ivan Wong; and Sanjeev Singh.

Not the Rakesh G. you were looking for?

  • Rakesh G.

    Head of Data & Advanced Analytics at Miracle Software Systems, Inc.
    Greater Houston

  • Rakesh Guddemgari

    "Full Stack .Net Developer | Azure Enthusiast | Building Scalable, Secure Web Applications For Aviation & Enterprise Success" | Actively Seeking | Ready To Relocate
    United States
  • Rakesh G

    Sr AWS DevOps Engineer at Cotiviti
    United States
  • Rakesh G

    United States
