Gowtham R

Gowtham R Email and Phone Number

Senior Data Engineer at Morgan Stanley | Hive | Python | Azure | PySpark | Spark SQL | Azure Databricks | Hadoop | Snowflake | ETL | SQL | Airflow | Agile | Actively looking for new opportunities on C2C/C2H @ Morgan Stanley
New York, New York, United States
Gowtham R's Location
Manchester, New Hampshire, United States
About Gowtham R

• Over 9 years of experience analyzing, designing, developing, and implementing data architectures and frameworks as a Data Engineer.
• Specialized in Data Warehousing and Decision Support Systems, with extensive experience implementing full-lifecycle Data Warehousing projects and Hadoop/Big Data experience in storage, querying, processing, and analysis of data.
• Software development on cloud computing platforms including Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP).
• Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
• Built a program with Python and Apache Beam, executed on Cloud Dataflow, to run data validation between raw source files and BigQuery tables.
• Knowledge of installing, configuring, and using Hadoop ecosystem components including MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, ZooKeeper, and Flume.
• Experience analyzing data using HiveQL, HBase, and custom MapReduce programs.
• Experience importing and exporting data using Sqoop between HDFS and relational database systems such as Teradata, Oracle, and SQL Server.
• Designed and implemented migration strategies for traditional systems on Azure (lift-and-shift/Azure Migrate and other third-party tools); worked across the Azure suite: Azure SQL Database, Azure Data Lake Storage (ADLS), Azure Data Factory (ADF) V2, Azure SQL Data Warehouse, Azure Service Bus, Azure Key Vault, Azure Analysis Services (AAS), Azure Blob Storage, Azure Search, Azure App Service, and the Azure data platform services.
• Hands-on experience with GCP: BigQuery, GCS buckets, Cloud Functions, Cloud Dataflow, Dataproc, and Stackdriver.
• Developed complex mappings to load data from various sources into the Data Warehouse, using transformations/stages such as Joiner, Transformer, Aggregator, Update Strategy, Rank, Lookup, Filter, Sorter, Source Qualifier, and Stored Procedure.
• Implemented a POC to migrate MapReduce jobs to Spark transformations using Python.
• Developed Apache Spark jobs using Python in a test environment for faster data processing, and used Spark SQL for querying.
• Experienced in Spark Core, Spark RDDs, Pair RDDs, and Spark deployment architectures.
• An accomplished Data Engineer experienced in ingestion, storage, querying, processing, and analysis of big data, and an expert in designing data warehousing solutions across a variety of database technologies.
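The profile mentions a Python/Apache Beam job run on Cloud Dataflow to validate raw source files against BigQuery tables. As a hedged illustration of the comparison idea only (no Beam or BigQuery dependencies; the file contents, function names, and hashing scheme below are hypothetical, not from the actual project):

```python
import csv
import hashlib
from io import StringIO

def row_fingerprints(rows):
    """Hash each row so source and target can be compared order-independently."""
    return {hashlib.md5("|".join(r).encode()).hexdigest() for r in rows}

def validate(source_rows, target_rows):
    """Return (missing_in_target, extra_in_target) fingerprint sets."""
    src, tgt = row_fingerprints(source_rows), row_fingerprints(target_rows)
    return src - tgt, tgt - src

# Example: a raw CSV extract vs. rows pulled back from a warehouse table.
raw = list(csv.reader(StringIO("id,amount\n1,10\n2,20\n")))
loaded = [["id", "amount"], ["1", "10"], ["2", "20"]]
missing, extra = validate(raw, loaded)
```

In a real Beam pipeline the same set-difference logic would run over `PCollection`s, with the target side read from BigQuery rather than held in memory.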

Gowtham R's Current Company Details
Morgan Stanley

Employees:
78669
Gowtham R Work Experience Details
  • Morgan Stanley
    GCP Data Engineer
    Morgan Stanley Jul 2023 - Present
    Chicago, Illinois, United States
    • Developed Spark programs to parse raw data, populate staging tables, and store refined data in partitioned tables in the Enterprise Data Warehouse.
    • Experience building Power BI reports on Azure Analysis Services for better performance.
    • Developed streaming applications using PySpark to read from Kafka and persist the data to NoSQL databases such as HBase and Cassandra.
    • Implemented PySpark scripts using Spark SQL to access Hive tables in Spark for faster data processing.
    • Worked on Big Data Hadoop cluster implementation and data integration while developing large-scale system software.
    • Migrated an entire Oracle database to BigQuery and used Power BI for reporting.
    • Built data pipelines in Airflow on GCP for ETL-related jobs using different Airflow operators.
    • Developed streaming and batch processing applications using PySpark to ingest data from various sources into the HDFS data lake.
    • Developed DDL and DML scripts in SQL and HQL for analytics applications in RDBMS and Hive.
    • Developed and implemented HQL scripts to create partitioned and bucketed tables in Hive for optimized data access.
    • Used the Cloud Shell SDK in GCP to configure services such as Dataproc, Cloud Storage, and BigQuery.
    • Wrote Hive UDFs to implement custom aggregation functions in Hive.
    • Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes.
    • Monitored YARN applications; troubleshot and resolved cluster-related system problems.
    • Created shell scripts to parameterize Hive actions in Oozie workflows and to schedule jobs.
    • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
    • Played a key role in a team developing an initial prototype of a NiFi big data pipeline, demonstrating an end-to-end scenario of data ingestion and processing.
    • Used NiFi to check whether a message reached the end system.
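The partitioned and bucketed Hive tables described in this role follow a standard HQL DDL pattern. A hedged sketch, rendered here by a small Python helper so it stays in the profile's main language (the table and column names are hypothetical, not from the actual project):

```python
def hive_partitioned_ddl(table, cols, partition_col, bucket_col, buckets):
    """Render CREATE TABLE DDL for a partitioned, bucketed Hive table."""
    col_defs = ", ".join(f"{name} {typ}" for name, typ in cols)
    return (
        f"CREATE TABLE {table} ({col_defs}) "
        f"PARTITIONED BY ({partition_col} STRING) "   # partition column lives outside the column list
        f"CLUSTERED BY ({bucket_col}) INTO {buckets} BUCKETS "
        f"STORED AS ORC"
    )

ddl = hive_partitioned_ddl(
    "trades_refined",                                  # hypothetical table name
    [("trade_id", "BIGINT"), ("amount", "DECIMAL(18,2)")],
    partition_col="trade_date",
    bucket_col="trade_id",
    buckets=32,
)
```

Partitioning prunes scans to the requested dates; bucketing on the join key speeds up joins and sampling.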
  • Ace Hardware
    Senior Data Engineer
    Ace Hardware Oct 2021 - Jun 2023
    Oak Brook, Illinois, United States
    • Imported data using Sqoop to load data from Teradata to HDFS on a regular basis.
    • Wrote Hive queries for ad-hoc reporting to the business.
    • Participated in weekly release meetings with technology stakeholders to identify and mitigate potential risks associated with releases.
    • Implemented AWS solutions using EC2, S3, RDS, EBS, Elastic Load Balancer, and Auto Scaling groups; optimized volumes and EC2 instances.
    • Wrote Terraform templates for AWS Infrastructure as Code to build staging and production environments, and set up build automation in Jenkins.
    • Configured Elastic Load Balancers (ELB) with EC2 Auto Scaling groups.
    • Created an Amazon VPC with a public-facing subnet for web servers with internet access, and backend databases and application servers in a private subnet with no internet access.
    • Created AWS launch configurations based on a customized AMI and used them to configure Auto Scaling groups.
    • Utilized Puppet for configuration management of hosted instances within AWS; configured networking for the Virtual Private Cloud (VPC).
    • Utilized S3 buckets and Glacier for storage and backup on AWS.
    • Used AWS Identity and Access Management (IAM) to create groups and permissions for users to work collaboratively.
    • Implemented and set up a continuous build and deployment delivery process using Subversion, Git, Jenkins, IIS, and Tomcat.
    • Connected the continuous integration system to the Git version control repository to build continually as check-ins come in from developers.
    • Knowledge of the build tools Ant and Maven, writing build.xml and pom.xml respectively.
    • Knowledge of authoring pom.xml files, performing releases with the Maven release plugin, and managing Maven repositories; implemented Maven builds to automate JAR and WAR files.
    • Designed and built deployments using Ant/shell scripting and automated the overall process using Git and Maven.
  • Travelport
    Senior Data Engineer
    Travelport May 2019 - Sep 2021
    Englewood, Colorado, United States
    • Served as the POC, responsible for the overall design and implementation of an enterprise data migration process from legacy Oracle/DB2 sources to RDS Postgres and Amazon Redshift using AWS Database Migration Service, the Schema Conversion Tool, and migration agents.
    • Designed and implemented highly scalable ETL using the Matillion tool; developed numerous orchestration and transformation jobs and nested them as master jobs in Matillion.
    • Dockerized ETL components and deployed them to data-specific ECS clusters using Jenkins/Git; configured ETL services, RDS, and Redshift logs to Splunk and SteelCentral for enterprise monitoring.
    • Handled optimization, performance tuning, and maintenance for cloud databases.
    • Defined and deployed monitoring, metrics, and logging systems on AWS, primarily configuring CloudWatch metrics for RDS and Redshift.
    • Implemented Workload Management (WLM) in Redshift to prioritize basic dashboard queries over more complex, longer-running ad-hoc queries, giving a more reliable and faster reporting interface with sub-second query response for basic queries.
    • Developed various Spark jobs for processing Parquet data files; responsible for designing logical and physical data models for various data sources on Amazon Redshift.
    • Implemented data extract processes between CME BIC (Beneficiary Cloud) and CMS RASS products.
    • Optimized, tuned, and automated the Redshift DW environment using AWS utilities.
    • For the RASS project, extensively used Qlik Replicate (formerly Attunity) and Qlik Compose for Data Warehouses to automate data ingestion and data curation processes.
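The Redshift Workload Management setup described in this role is typically expressed as a JSON parameter on the cluster's parameter group. A hedged sketch of a two-queue configuration that favors short dashboard queries (queue names, concurrency levels, and the timeout are illustrative, not the actual production settings):

```python
import json

# Two WLM queues: short "dashboard" queries get higher concurrency and a
# strict timeout; "adhoc" analytics take the remaining slots.
wlm_config = [
    {"query_group": ["dashboard"],
     "query_concurrency": 10,
     "max_execution_time": 5000},   # milliseconds; evict runaway "fast" queries
    {"query_group": ["adhoc"],
     "query_concurrency": 3},
]
wlm_json = json.dumps(wlm_config)   # value for the wlm_json_configuration parameter
```

Queries opt into a queue at session time (e.g. `SET query_group TO 'dashboard';`), so the dashboard layer can reserve capacity without changes to the ad-hoc tooling.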
  • Ensar Solutions Inc
    Data Engineer
    Ensar Solutions Inc Apr 2016 - Feb 2019
    Hyderabad, Telangana, India
    • Involved in implementing a project that went through several phases: data set analysis, data set preprocessing, user-generated data extraction, and modeling.
    • Participated in data acquisition with the Data Engineer team to extract historical and real-time data using Sqoop, Pig, Flume, Hive, MapReduce, and HDFS.
    • Wrote user-defined functions (UDFs) in Hive to manipulate strings, dates, and other data.
    • Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python.
    • Process improvement: analyzed error data of recurrent programs using Python and devised a new process that reduced problem-resolution turnaround time by 60%.
    • Worked on production data fixes by creating and testing SQL scripts.
    • Dived deep into complex data sets to analyze trends using Linear Regression, Logistic Regression, and Decision Trees.
    • Prepared reports using SQL and Excel to track the performance of websites and apps.
    • Visualized data using Tableau to highlight abstract information.
    • Applied clustering algorithms (Hierarchical, K-means) using scikit-learn and SciPy.
    • Performed data collection, data cleaning, data visualization, and feature engineering using Python libraries such as pandas, NumPy, Matplotlib, and Seaborn.
    • Optimized SQL queries for transforming raw data into MySQL with Informatica to prepare structured data for machine learning.
    • Used Tableau for data visualization and interactive statistical analysis.
    • Worked with business analysts to understand user requirements and the layout and look of the interactive dashboard.
    • Used SSIS to create ETL packages to validate, extract, transform, and load data into the Data Warehouse and Data Mart.
    • Classified customer lifetime values based on the RFM model using an XGBoost classifier.
    • Maintained and developed complex SQL queries, stored procedures, views, functions, and reports that met customer requirements using Microsoft SQL Server.
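The RFM-based lifetime-value classification mentioned in this role can be illustrated by the feature side alone. A minimal stdlib sketch of RFM scoring (the field names and cut-offs below are hypothetical, chosen for illustration; the real project would derive thresholds from quantiles before feeding scores to an XGBoost classifier):

```python
from dataclasses import dataclass

@dataclass
class Customer:
    days_since_last_order: int  # Recency
    orders_last_year: int       # Frequency
    total_spend: float          # Monetary

def rfm_score(c: Customer) -> tuple:
    """Score each RFM dimension 1-3; higher is better. Cut-offs are illustrative."""
    r = 3 if c.days_since_last_order <= 30 else 2 if c.days_since_last_order <= 90 else 1
    f = 3 if c.orders_last_year >= 12 else 2 if c.orders_last_year >= 4 else 1
    m = 3 if c.total_spend >= 1000 else 2 if c.total_spend >= 200 else 1
    return r, f, m

# A recent, moderately frequent, mid-spend customer.
score = rfm_score(Customer(days_since_last_order=15, orders_last_year=6, total_spend=500.0))
```

The resulting `(r, f, m)` tuples become the feature matrix for the classifier.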
  • Cameo Corporate Services
    Data Analyst
    Cameo Corporate Services Aug 2014 - Mar 2016
    Hyderabad, Telangana, India
    • Primarily worked on a project to develop an internal ETL product to handle complex, large-volume healthcare claims data; designed the ETL framework and developed a number of packages to extract, transform, and load data using SQL Server Integration Services (SSIS) into local MS SQL 2012 databases to facilitate reporting operations.
    • Involved in various transformation and data cleansing activities using Control Flow and Data Flow tasks in SSIS packages during data migration.
    • Applied various data transformations such as Lookup, Aggregate, Sort, Multicast, Conditional Split, and Derived Column.
    • Developed mappings, sessions, and workflows to extract, validate, and transform data per the business rules using Informatica.
    • Supported data migration projects; migrated data from SQL Server to Netezza using the nz_migrate utility.
    • Designed target tables per requirements from the reporting team and designed Extraction, Transformation, and Loading (ETL) using Talend.
    • Worked on Netezza SQL scripts to load data between Netezza tables.
    • Scheduled Talend jobs using Job Conductor (the scheduling tool in Talend), available in TAC.
    • Wrote queries, stored procedures, and complex T-SQL joins to address various reporting operations and ad-hoc data requests.
    • Performed performance monitoring and index optimization tasks using Performance Monitor, SQL Profiler, Database Engine Tuning Advisor, and the Index Tuning Wizard.
    • Acted as the point of contact to resolve locking/blocking and performance issues.
    • Wrote scripts and an indexing strategy for a migration from SQL Server and MySQL databases to Amazon Redshift.
    • Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
    • Used a JSON schema to define table and column mappings from S3 data to Redshift, and worked on indexing and data distribution strategies optimized for sub-second query response.
    • Worked with Dell Boomi connectors such as FTP, Mail, Database, Salesforce, Web Services Listener, and HTTP Client.
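The S3-to-Redshift mapping described in this role typically pairs a JSONPaths file with a COPY command. A hedged sketch of generating both (the bucket, table, column names, and IAM role ARN are hypothetical placeholders, not the actual project's):

```python
import json

def jsonpaths_for(columns):
    """Redshift JSONPaths file body: one path per target column, in column order."""
    return json.dumps({"jsonpaths": [f"$.{c}" for c in columns]})

def copy_statement(table, bucket, key, jsonpaths_key, iam_role):
    """COPY command loading JSON from S3 using the JSONPaths mapping."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' "
        f"JSON 's3://{bucket}/{jsonpaths_key}'"
    )

paths = jsonpaths_for(["claim_id", "member_id", "paid_amount"])
sql = copy_statement("claims", "my-etl-bucket", "claims/2016/01.json",
                     "claims/jsonpaths.json",
                     "arn:aws:iam::123456789012:role/redshift-load")
```

The JSONPaths file is uploaded alongside the data; COPY then maps each `$.field` expression to the target column in order, so renames on either side only require editing this one mapping.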

Gowtham R Education Details

Frequently Asked Questions about Gowtham R

What company does Gowtham R work for?

Gowtham R works for Morgan Stanley

What is Gowtham R's role at the current company?

Gowtham R's current role is Senior Data Engineer at Morgan Stanley | Hive | Python | Azure | PySpark | Spark SQL | Azure Databricks | Hadoop | Snowflake | ETL | SQL | Airflow | Agile | Actively looking for new opportunities on C2C/C2H.

What schools did Gowtham R attend?

Gowtham R attended JNTUH College of Engineering Hyderabad.

Who are Gowtham R's colleagues?

Gowtham R's colleagues are Nicholas Mastellone, Amit Kumar, Ellie Kenney, Michael Schwartzman, Isabella Diniz, Robert Gilraine, Kristina West, Qpfc™.

Not the Gowtham R you were looking for?

  • Gowtham R

    Farmington, MI
  • Gowtham R

    Raleigh-Durham-Chapel Hill Area
  • Gowtham R

    Actively Seeking New Opportunities | Senior SAP Functional Consultant | SAP EWM | SAP TM | SAP MM | SAP SD | SAP PP | SAP S/4HANA | SAP Supply Chain
    United States
  • Gowtham R

    Data Engineer | Experienced in Building Scalable Data Solutions | Expertise in Big Data, Cloud Platforms (AWS, GCP, Azure), and Advanced ETL Processes
    United States
