Pruthvi R

Azure System Engineer at Giant Eagle | Big Data | Python | Azure | PySpark | Spark SQL | Azure Databricks | Hadoop | Snowflake | ETL | SQL | Airflow | Agile | Lead | Actively looking for new opportunities on C2C/C2H @ Giant Eagle, Inc.
Pittsburgh, Pennsylvania, United States
About Pruthvi R

Pruthvi R is an Azure System Engineer at Giant Eagle, Inc., working across Big Data, Python, Azure, PySpark, Spark SQL, Azure Databricks, Hadoop, Snowflake, ETL, SQL, Airflow, and Agile in a lead role, and is actively looking for new opportunities on C2C/C2H.

Pruthvi R's Current Company Details
Giant Eagle, Inc.

Azure System Engineer
Pittsburgh, Pennsylvania, United States
Website: gianteagle.com
Employees: 4,380
Pruthvi R Work Experience Details
  • Giant Eagle, Inc.
    AWS Data Engineer
    Giant Eagle, Inc. Mar 2022 - Present
    Pittsburgh, Pennsylvania, United States
    • Working with offshore and onsite teams for sync-ups.
    • Using Hive extensively to create views for feature data.
    • Creating and maintaining automation jobs for different data sets.
    • Interacting with multiple teams to understand their business requirements and design flexible, common components.
    • Installed and configured Apache Airflow for workflow management, created workflows in Python, and configured Airflow DAGs for various feeds (see the sketch after this entry).
    • Developed pipelines for auditing the metrics of all applications using GCP Cloud Functions and Dataflow for a pilot project.
    • Developed an end-to-end pipeline that exports data from Parquet files in Cloud Storage to GCP Cloud SQL.
    • Implemented Spark SQL to access Hive tables from Spark for faster data processing.
    • Used Hive for transformations, joins, filters, and pre-aggregations before storing the data.
    • Visualized data sets with PySpark in Jupyter notebooks; validated and visualized data in Tableau.
    • Created Sentry policy files to give business users access to the required databases and tables through Impala in the dev, UAT, and prod environments.
    • Created and validated Hive views using Hue.
    • Created deployment documents and user manuals for validating the data sets.
    • Created a data dictionary for universal data sets.
    • Working with AWS and GCP, including GCP Cloud Storage, Dataproc, Dataflow, BigQuery, EMR, S3, Glacier, and EC2 instances with EMR clusters.
    • Working closely with the platform and Hadoop teams on the team's needs.
    • Using Kafka for data ingestion across different data sets.
    • Experienced in importing and exporting data into HDFS and assisted in exporting analyzed data to an RDBMS using Sqoop.
    • Responsible for creating on-demand tables on S3 files using Lambda functions and AWS Glue with Python and PySpark.
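
    A minimal sketch of the kind of feed-ingestion DAG described above, assuming Airflow 2.x; the DAG id, schedule, feed names, and load logic are hypothetical, not taken from the profile:

      from datetime import datetime

      from airflow import DAG
      from airflow.operators.python import PythonOperator

      def load_feed(feed_name: str) -> None:
          # Placeholder for the real ingestion step, e.g. a Spark job submit.
          print(f"loading feed: {feed_name}")

      with DAG(
          dag_id="feed_ingestion",             # hypothetical DAG name
          start_date=datetime(2022, 3, 1),
          schedule_interval="@daily",
          catchup=False,
      ) as dag:
          for feed in ["sales", "inventory"]:  # hypothetical feed list
              PythonOperator(
                  task_id=f"load_{feed}",
                  python_callable=load_feed,
                  op_kwargs={"feed_name": feed},
              )
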
  • Shift4 Payments Lithuania
    Senior Data Engineer
    Shift4 Payments Lithuania Apr 2020 - Feb 2022
    Maryland, United States
    • Worked on loading data from MySQL and Teradata into HBase where necessary using Sqoop.
    • Used Sqoop for importing and exporting data from Netezza and Teradata into HDFS and Hive.
    • Created Teradata schemas with constraints; created macros, functions, and procedures in Teradata; loaded data using the FastLoad utility.
    • Created external Hive tables to store and query the loaded data.
    • Applied optimization techniques including partitioning and bucketing (see the sketch after this entry).
    • Extensively used the PySpark API to process structured data sources.
    • Used the Avro file format compressed with Snappy in intermediate tables for faster data processing.
    • Developed data ingestion modules (both real-time and batch) to load data into various layers in S3, Redshift, and Snowflake using AWS Kinesis, AWS Glue, AWS Lambda, and AWS Step Functions.
    • Developed Spark/Scala and Python code for a regular-expression (regex) project in the Hadoop/Hive environment on Linux/Windows big data resources.
    • Extracted, transformed, and loaded data sources to generate CSV data files using Python programs and SQL queries.
    • Configured failure alerts and status alerts for long-running jobs in Airflow.
    • Used the Parquet file format for published tables and created views on those tables.
    • Created Sentry policy files to give business users access to the required databases and tables through Impala in the dev, UAT, and prod environments.
    • Automated jobs with Oozie and scheduled them with Autosys.
    • Experience in AWS spinning up EMR clusters to process large data sets stored in S3 and push them to HDFS.
    • Participated in a collaborative team designing software and developing a Snowflake data warehouse within AWS.
    • Developed DAGs in Airflow to schedule and orchestrate multiple Spark jobs.
    • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
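
    A minimal sketch of the partitioning and bucketing optimizations mentioned above, written with the PySpark API; the paths, table, and column names are hypothetical, and reading Avro assumes the spark-avro package is on the classpath:

      from pyspark.sql import SparkSession

      spark = (
          SparkSession.builder
          .appName("bucketed-load")
          .enableHiveSupport()  # so saveAsTable writes to the Hive metastore
          .getOrCreate()
      )

      # Intermediate layer stored as Avro (Snappy-compressed), per the notes above.
      df = spark.read.format("avro").load("/staging/transactions")

      (
          df.write
          .partitionBy("txn_date")     # coarse pruning on the date column
          .bucketBy(32, "account_id")  # co-locate rows for join-heavy queries
          .sortBy("account_id")
          .format("parquet")           # published tables use Parquet
          .saveAsTable("analytics.transactions")
      )
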
  • U.S. Chamber of Commerce
    Big Data Developer
    U.S. Chamber of Commerce Jul 2017 - Mar 2019
    • Supported Hive programs running on the cluster.
    • Involved in loading data from the UNIX file system to HDFS.
    • Installed and configured Hive and wrote Hive UDFs.
    • Involved in creating Hive tables, loading data, and writing Hive queries.
    • Worked closely with AWS to migrate entire data centers to the cloud using VPC, EC2, S3, EMR, RDS, Splice Machine, and DynamoDB services.
    • Worked on data pre-processing and cleaning for feature engineering, and performed data imputation for missing values using Python.
    • Created data quality scripts using SQL and Hive to validate successful data loads and the quality of the data (see the sketch after this entry); created various data visualizations using Python and Tableau.
    • Performed end-to-end delivery of PySpark ETL pipelines on Azure Databricks to transform data, orchestrated via Azure Data Factory (ADF), scheduled through Azure Automation accounts, and triggered using the Tidal Scheduler.
    • Hands-on experience with Oozie job scheduling.
    • Worked with big data developers, designers, and scientists to troubleshoot MapReduce and Hive jobs and tune them for high performance.
    • Automated the end-to-end workflow from data preparation to the presentation layer for the Artist Dashboard project using shell scripting.
    • Used Azure PowerShell to deploy Azure Databricks (languages: PySpark, Scala).
    • Provided input to Product Management to influence feature requirements for compute and networking in the VMware cloud offering.
    • Developed MapReduce programs to extract and transform data sets; the resulting data sets were loaded into Cassandra.
    • Orchestrated Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
    • Conducted RCA to find data issues and resolve production problems.
    • Involved in loading the generated files into MongoDB for faster access to a large customer base without a performance hit.
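
    A minimal sketch of a post-load data-quality check like the SQL/Hive scripts described above, expressed here in PySpark; the table name, key column, and checks are hypothetical:

      from pyspark.sql import SparkSession
      from pyspark.sql import functions as F

      spark = SparkSession.builder.appName("dq-check").enableHiveSupport().getOrCreate()

      df = spark.table("staging.orders")  # hypothetical Hive table

      stats = df.agg(
          F.count("*").alias("row_count"),
          F.countDistinct("order_id").alias("distinct_keys"),
          F.sum(F.col("order_id").isNull().cast("int")).alias("null_keys"),
      ).first()

      # Fail the load loudly if it looks incomplete or the key column is broken.
      assert stats["row_count"] > 0, "empty load"
      assert stats["null_keys"] == 0, "null keys in order_id"
      assert stats["distinct_keys"] == stats["row_count"], "duplicate order_id values"
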
  • KPIT Cummins Info System Limited, Hinjewadi, Pune
    Hadoop Developer
    KPIT Cummins Info System Limited, Hinjewadi, Pune Jan 2015 - Apr 2017
    Pune, Maharashtra, India
    • Created and maintained technical documentation for launching Cloudera Hadoop clusters and for executing Hive queries and Pig scripts.
    • Implemented JMS for asynchronous auditing.
    • Experience automating deployment, management, and self-serve troubleshooting of applications.
    • Defined and evolved the existing architecture to scale with growth in data volume, users, and usage.
    • Designed and developed a Java API (Commerce API) that connects to Cassandra through Java services.
    • Worked on data cleaning and reshaping, generating segmented subsets using NumPy and pandas in Python.
    • Utilized the Spark SQL API in PySpark to extract and load data and perform SQL queries.
    • Developed a PySpark script to encrypt raw data by applying hashing algorithms to client-specified columns (see the sketch after this entry).
    • Responsible for managing data from multiple sources; BI/DW and analytics knowledge.
    • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
    • Responsible for managing data coming from different sources.
    • Installed and configured Hive and wrote Hive UDFs.
    • Experience managing CVS and migrating to Subversion.
    • Experience defining, designing, and developing Java applications, especially using Hadoop MapReduce and leveraging frameworks such as Cascading and Hive.
    • Experience documenting designs and procedures for building and managing Hadoop clusters.
    • Strong experience troubleshooting the operating system, maintaining the cluster, and resolving Java-related bugs.
    • Assisted in exporting analyzed data to relational databases using Sqoop; experienced in importing and exporting data into HDFS.
    • Developed MapReduce jobs using the Java API and wrote MapReduce jobs using Pig Latin.
    • Developed workflows using Oozie for running MapReduce jobs and Hive queries.
    • Worked on cluster coordination services through ZooKeeper.
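
    A minimal sketch of hashing client-specified columns in PySpark, as the entry above describes; the input and output paths and the column list are hypothetical:

      from pyspark.sql import SparkSession
      from pyspark.sql import functions as F

      spark = SparkSession.builder.appName("mask-columns").getOrCreate()

      SENSITIVE_COLUMNS = ["ssn", "email"]  # client-specified columns (hypothetical)

      df = spark.read.parquet("/raw/customers")

      # Replace each sensitive column with its SHA-256 digest so raw values
      # never reach downstream layers.
      for col_name in SENSITIVE_COLUMNS:
          df = df.withColumn(col_name, F.sha2(F.col(col_name).cast("string"), 256))

      df.write.mode("overwrite").parquet("/curated/customers")
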
  • Allvy
    Python Developer
    Allvy Jul 2013 - Dec 2015
    Hyderabad, Telangana, India
    • Involved in the design, development, and support phases of the software development life cycle (SDLC).
    • Developed processes, DevOps tooling, and automation for a Jenkins-based build system delivering software builds.
    • Developed and tested many features in an Agile environment using HTML5, CSS, JavaScript, jQuery, and Bootstrap.
    • Used Python scripts to generate various reports, including transaction history, OATS, user privileges, limit rules, and commission schedule reports.
    • Designed the front end and back end of the application using Python on the Django web framework; developed views and templates with Django's view controller and templating language to create a user-friendly website interface (see the sketch after this entry).
    • Used HTML, CSS, AJAX, and JSON to design and develop the user interface of the website.
    • Worked with Scrapy for web scraping to extract structured data from websites for analysis.
    • Used SVN and CVS as version control for existing systems.
    • Used JIRA to maintain system protocols by writing and updating procedures, business case requirements, and functional requirement specification documents.
    • Implemented unit testing using PyUnit and tested several RESTful services using SoapUI.
    • Developed consumer-facing features and applications using Python, Django, HTML, behavior-driven development (BDD), and pair programming.
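
    A minimal sketch of the Django view-and-template pattern described above; the view name, template path, and report rows are hypothetical:

      from django.shortcuts import render

      def transaction_history(request):
          # A real view would query the ORM, e.g. Transaction.objects.filter(...).
          transactions = [
              {"id": 1, "amount": 250.00},
              {"id": 2, "amount": 75.50},
          ]
          return render(
              request,
              "reports/transaction_history.html",  # hypothetical template
              {"transactions": transactions},
          )
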

Pruthvi R Education Details
  • SRM University

Frequently Asked Questions about Pruthvi R

What company does Pruthvi R work for?

Pruthvi R works for Giant Eagle, Inc.

What is Pruthvi R's role at the current company?

Pruthvi R's current role is Azure System Engineer at Giant Eagle, Inc.

What schools did Pruthvi R attend?

Pruthvi R attended SRM University.

Who are Pruthvi R's colleagues?

Pruthvi R's colleagues are Jeff Mcwilliams, Patricia Woods, Steven Moffat, Devin Pavick, Sheila Stevenson, Michele Shaffer, Carole Morgan.
