Roshan P Email and Phone Number
Roshan P is a Senior GCP Data Engineer at Chewy | Big Data | Python | Azure | PySpark | Spark SQL | Azure Databricks | Hadoop | Snowflake | ETL | SQL | Airflow | Agile | Actively looking for new opportunities on C2C/C2H.
Senior GCP Data Engineer, Chewy
Jul 2022 - Present | Dania, Florida, United States
• Extensive background in designing, developing, testing, and implementing technical solutions with GCP data technologies and tools, including BigQuery, Dataproc, Dataflow, Cloud SQL, Cloud Functions, Cloud Run, Cloud Composer, Pub/Sub, and various APIs.
• Crafted data solutions within distributed microservices and full-stack systems, harnessing GCP's data technologies to create scalable and efficient architectures.
• Proficient in Python and Java for implementing data solutions on GCP, leveraging the platform's capabilities effectively.
• Led performance engineering initiatives to ensure that GCP-based systems were scalable, optimized, and met performance benchmarks.
• Demonstrated expertise in both cloud and on-premises technologies, with a focus on GCP's cloud technologies for data pipeline development.
• Proficient in data ingestion, storage, and processing using GCP technologies such as Dataproc, Dataflow, Cloud Composer, Pub/Sub, and various APIs.
• Implemented best practices and optimized data solutions on GCP, ensuring efficient data processing, data governance, and data quality.
• Used Python libraries such as NumPy, Pandas, and SciPy for data wrangling and analysis, and visualization libraries such as Seaborn and Matplotlib for plotting.
• Presented dashboards using BI analytics tools such as Power BI.
• Ensured clarity on non-functional requirements (NFRs) and implemented them effectively in the data solutions developed on GCP.
• Built and deployed machine learning models on GCP using services such as AI Platform, enabling seamless integration of data preprocessing, training, and serving.
• Skilled at using Snowflake's multi-cluster shared data architecture to develop and deploy scalable, effective data architectures.
• Demonstrated expertise in Snowflake data modeling and schema design, ensuring data integrity and optimal query execution.
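The data-wrangling work described above (grouping and summarizing records pulled from BigQuery) can be sketched with the standard library for brevity; the resume mentions Pandas and NumPy, but the grouping logic is the same. The record fields and values here are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records, standing in for rows pulled from BigQuery.
orders = [
    {"region": "east", "amount": 120.0},
    {"region": "east", "amount": 80.0},
    {"region": "west", "amount": 200.0},
]

def average_by_region(rows):
    """Group order amounts by region and compute the mean per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["region"]].append(row["amount"])
    return {region: mean(amounts) for region, amounts in groups.items()}

summary = average_by_region(orders)
print(summary)  # {'east': 100.0, 'west': 200.0}
```

In Pandas the same step would be a single `groupby("region")["amount"].mean()` call.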
GCP Data Engineer, UBS
Nov 2019 - Jun 2022 | Weehawken, New Jersey, United States
• Successfully migrated an entire Oracle database to BigQuery and leveraged Power BI for comprehensive reporting.
• Developed data pipelines in Google Cloud Platform (GCP) using Apache Airflow, employing various Airflow operators for ETL tasks.
• Proficient in GCP technologies, including Dataproc, Google Cloud Storage (GCS), Cloud Functions, and BigQuery.
• Demonstrated experience moving data between GCP and Azure through Azure Data Factory.
• Built Power BI reports on Azure Analysis Services to enhance performance and reporting capabilities.
• Used the Cloud Shell SDK in GCP to configure services such as Dataproc, Storage, and BigQuery.
• Collaborated with the team to design and implement a framework for generating daily ad-hoc reports and extracts from enterprise data stored in BigQuery.
• Coordinated with the Data Science team to implement advanced analytical models on Hadoop clusters, managing large datasets efficiently.
• Created Hive SQL scripts for designing complex tables with high-performance features such as partitioning, clustering, and skewing.
• Retrieved BigQuery data into Pandas or Spark data frames, enabling advanced ETL capabilities.
• Used Google Data Catalog and other Google Cloud APIs for monitoring, querying, and billing-related analysis of BigQuery usage.
• Conducted proof-of-concept (POC) work using machine learning models and Cloud ML for table quality analysis in batch processes.
• Knowledgeable about Cloud Dataflow and Apache Beam for data processing and streaming.
• Proficient in using Cloud Shell for various tasks and deploying services in GCP.
• Created authorized views in BigQuery to implement row-level security and share data with other teams.
• Demonstrated expertise in designing and deploying Hadoop clusters, alongside various big data analytic tools including Pig, Hive, Sqoop, and Apache Spark, using the Cloudera Distribution.
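One pattern mentioned above, sharing data via authorized views that filter the base table, can be sketched as a small helper that builds the view DDL. The project, dataset, table, and column names are hypothetical placeholders.

```python
def authorized_view_ddl(project, dataset, view, source_table, filter_column, value):
    """Build a CREATE VIEW statement exposing only rows where
    filter_column equals the given value.

    Granting another team access to this view (rather than the base
    table) is a common way to approximate row-level security in
    BigQuery; all identifiers here are placeholders.
    """
    return (
        f"CREATE OR REPLACE VIEW `{project}.{dataset}.{view}` AS\n"
        f"SELECT * FROM `{project}.{dataset}.{source_table}`\n"
        f"WHERE {filter_column} = '{value}'"
    )

ddl = authorized_view_ddl("my-proj", "reporting", "sales_east_v", "sales", "region", "east")
print(ddl)
```

In practice the view's dataset would then be added to the source dataset's authorized-views list so the view can read the base table without its consumers having direct access.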
AWS Data Engineer, Centene Corporation
May 2017 - Oct 2019 | St. Louis, Missouri, United States
• Conducted complex data transformations from various sources within AWS Redshift, unloading the resulting datasets into the Hive/Presto stage built on the AWS S3 data lake.
• Built Hive queries with applicable parameters to load data from the Hive/Presto stage into the target Hive/Presto tables, often facilitated through AWS RDS.
• Leveraged Amazon Web Services (AWS) components, including Elastic MapReduce (EMR), Redshift, and EC2, for efficient data processing.
• Optimized AWS Redshift SQL queries by selecting appropriate distribution styles and keys for enhanced query performance.
• Worked proficiently within the Hadoop framework, using the Hadoop Distributed File System and components such as Pig, Hive, Sqoop, and PySpark.
• Developed Python scripts to extract data from AWS S3 and load it into SQL Server for business teams without cloud access.
• Created PySpark scripts that run against MSSQL tables, pushing data into Hive tables in big data storage.
• Conducted statistical analysis on healthcare data using Python and various tools.
• Extensive experience with healthcare data, including data pre-processing pipelines for DICOM and non-DICOM images such as X-rays and CT scans.
• Developed Databricks notebooks for data preparation, including data cleansing, data validation, and transformations per project requirements.
• Automated complex workflows using Apache Airflow, streamlining processes and reducing the need for manual intervention.
• Constructed efficient data pipelines using Apache Hive, Apache Spark, Scala, and Apache Kafka, enabling seamless data processing and analysis.
• Implemented data lineage tracking and metadata management within Airflow pipelines to meet auditing and governance requirements.
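The unload step described above, moving Redshift query results into an S3 data lake stage, can be sketched as a helper that assembles the UNLOAD statement. The bucket path, IAM role ARN, and query are hypothetical placeholders.

```python
def redshift_unload(query, s3_path, iam_role):
    """Assemble a Redshift UNLOAD statement that writes query results
    to S3 as Parquet.

    Note: in a real statement, single quotes inside the query must be
    escaped; this sketch assumes a quote-free query. All names below
    are placeholders.
    """
    return (
        f"UNLOAD ('{query}')\n"
        f"TO '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS PARQUET"
    )

stmt = redshift_unload(
    "SELECT * FROM claims_summary",
    "s3://example-lake/stage/claims/",
    "arn:aws:iam::000000000000:role/example-unload-role",
)
print(stmt)
```

Unloading as Parquet keeps the staged files columnar and compressed, which suits downstream Hive/Presto reads.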
Azure Data Engineer, Avon Technologies (I) Private Ltd.
Sep 2015 - Feb 2017 | Hyderabad, Telangana, India
• Analyzed large data sets to determine the optimal way to aggregate and report on them.
• Created data analysis sheets from Athena as a source and implemented a data profiling document used to build solutions for the EDW layer in Snowflake.
• Built data models from the data profiling document in MySQL Workbench, gathering all details into a clear data model.
• Handled various solutions involving expressions and functions to generate output from MySQL.
• Created queries and MERGE scripts in Snowflake to implement different designs based on the business mapping sheet.
• Extracted data from Athena files placed weekly in a storage account, loading the files into the raw layer as CSV and then into a stage layer replicating the source.
• Implemented solutions using the copy activity and other Azure Data Factory activities to load multiple files into the processed layer as Parquet.
• Created Azure Data Factory pipelines to load data from multiple files into the final processed layer.
• Orchestrated an Azure Data Factory pipeline framework to handle different file formats, such as Excel with headers, text files with different delimiters, and zip files, using copy-activity features like zip handling, quote characters, and header options, with a metadata table supplying all details as parameters.
• Sent email notifications to end users by creating Logic Apps and calling them from Azure Data Factory pipelines, passing pipeline details as parameters via the Web activity.
• Implemented end-to-end data ingestion processes on Databricks, including data extraction, transformation, and loading (ETL) from diverse data sources, ensuring data consistency and accuracy.
• Created external tables in Snowflake on top of the processed layer by specifying individual paths in the external table definitions.
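The metadata-driven pipeline described above, where a metadata table supplies per-file-format copy-activity options as parameters, can be sketched as a small lookup over metadata rows. The patterns and option names here are hypothetical, loosely modeled on Azure Data Factory copy-activity settings.

```python
import json

# Hypothetical metadata rows driving a generic copy pipeline: one row
# per source file pattern, with the copy-activity options to apply.
file_metadata = [
    {"pattern": "*.xlsx", "first_row_as_header": True, "compression": None},
    {"pattern": "*.txt", "delimiter": "|", "quote_char": '"', "compression": None},
    {"pattern": "*.zip", "delimiter": ",", "compression": "ZipDeflate"},
]

def params_for(pattern):
    """Look up the copy-activity parameters for a given file pattern."""
    for row in file_metadata:
        if row["pattern"] == pattern:
            return row
    raise KeyError(pattern)

# The matched row would be serialized and passed into the pipeline
# run as parameters.
payload = json.dumps(params_for("*.zip"))
print(payload)
```

Keeping these options in a table means new file formats are onboarded by adding a row, not by editing the pipeline.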
Hadoop Developer, Couth Infotech Private Limited
May 2013 - Aug 2015 | Hyderabad, Telangana, India
• Developed Hive and Bash scripts for source data validation and transformation; automated data loading into HDFS and Hive for pre-processing using One Automation.
• Gathered data from data warehouses in Teradata and Snowflake.
• Developed Spark/Scala and Python code for regular-expression projects in the Hadoop/Hive environment.
• Designed and implemented an ETL framework to load data from multiple sources into Hive, and from Hive into Teradata.
• Generated reports using Tableau.
• Experience building big data applications using Cassandra and Hadoop.
• Used Sqoop, ETL tools, and Hadoop filesystem APIs to implement data ingestion pipelines.
• Worked on batch data of different granularities, ranging from hourly and daily to weekly and monthly.
• Hands-on experience in Hadoop administration and support activities, installing and configuring Apache big data tools and Hadoop clusters using Cloudera Manager.
• Handled Hadoop cluster installations in environments such as Unix, Linux, and Windows.
• Assisted in upgrading, configuring, and maintaining Hadoop infrastructure components such as Ambari, Pig, and Hive.
• Developed and wrote SQL queries and stored procedures in Teradata; loaded data into Snowflake and wrote SnowSQL scripts.
• Wrote TDCH scripts for full and incremental refreshes of Hadoop tables.
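The Sqoop-based ingestion mentioned above can be sketched as a helper that composes a `sqoop import` command for pulling a relational table into HDFS. The JDBC URL, table, and target directory are hypothetical placeholders.

```python
def sqoop_import_cmd(jdbc_url, table, target_dir, num_mappers=4):
    """Compose a Sqoop import command that pulls a relational table
    into HDFS; connection details below are placeholders."""
    return (
        "sqoop import "
        f"--connect {jdbc_url} "
        f"--table {table} "
        f"--target-dir {target_dir} "
        f"--num-mappers {num_mappers}"
    )

cmd = sqoop_import_cmd(
    "jdbc:teradata://td-host/DATABASE=edw",  # hypothetical source
    "orders",
    "/data/raw/orders",
)
print(cmd)
```

Splitting the import across mappers (`--num-mappers`) parallelizes the pull; an incremental refresh would add `--incremental` options keyed on a check column.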
Roshan P Education Details
Bachelor's Degree
Frequently Asked Questions about Roshan P
What company does Roshan P work for?
Roshan P works for Chewy.
What is Roshan P's role at the current company?
Roshan P's current role is Senior GCP Data Engineer at Chewy.
What schools did Roshan P attend?
Roshan P attended JNTUH College of Engineering, Hyderabad.
Who are Roshan P's colleagues?
Roshan P's colleagues include Walter Bratcher, Jeremy Terhaar, Jorge S. Sanchez, Jason S. Morga (PHR, CDMP), Amy Bernheim, Logan Cratic, and Carl Bricken.