I have over six years of experience designing, developing, and maintaining data architectures, database systems, and ETL pipelines, using tools such as Spark, Hadoop, Airflow, Databricks, Snowflake, and cloud services on both AWS and Azure. I am proficient in real-time data processing with Spark Streaming and Kafka, enabling rapid analysis and insight extraction from streaming data sources. I am also skilled in Python, XML and JSON processing, and data exchange, working with Python OpenStack APIs, performing numerical analysis with NumPy, and integrating with third-party APIs over REST and SOAP. I am passionate about driving data transformation and enrichment and delivering high-quality data solutions that support the organization's goals and vision.
-
Data Engineer, OLG | Mar 2022 - Present | Toronto, Ontario, Canada
● Developed ETL data pipelines using PySpark, reading data from external sources, merging and enriching it, and loading it into Azure SQL Data Warehouse.
● Developed PySpark scripts to process streaming data from data lakes with Spark Streaming, enabling real-time processing, and performed transformations, cleaning, and filtering with the Spark DataFrame API to load processed data into Hive efficiently (a minimal sketch follows this list).
● Leveraged Hive metastore backups, partitioning, and bucketing to optimize and tune Spark job performance.
● Possess a strong understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, and Spark Streaming, and of components such as the driver node, worker nodes, stages, executors, and tasks.
● Handled complex Hive queries, joining tables to extract insights relevant to Spark jobs.
● Converted Hive SQL queries into Spark transformations using Spark DataFrames and Python, analyzing SQL scripts to design efficient PySpark solutions, and tuned Spark applications by optimizing batch interval time, parallelism, and memory settings.
● Implemented hybrid connectivity between Azure and on-premises environments using virtual networks, VPN, and ExpressRoute.
● Migrated SQL databases to Azure Data Lake, Azure SQL Database, Databricks, and Azure SQL Data Warehouse, orchestrating with Apache Airflow to ensure seamless database access and migration.
● Used Azure Data Factory (ADF) extensively to ingest relational and unstructured data from different source systems to meet business requirements.
● Configured Snowpipe for continuous data flow from Azure Data Lake through an external stage into the Snowflake staging layer.
● Implemented end-to-end extract, transform, and load (ETL) processes using Azure Data Factory (ADF) and Azure HDInsight.
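To illustrate the streaming pattern described above, here is a minimal PySpark Structured Streaming sketch that reads newly arriving files from a landing zone, cleans them with the DataFrame API, and appends them to a Hive-compatible partitioned location. The paths, schema, and trigger interval are hypothetical placeholders, not the production pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("lake-to-hive-stream")
         .enableHiveSupport()
         .getOrCreate())

# Read newly arriving Parquet files from an assumed landing zone in the data lake.
events = (spark.readStream
          .schema("event_id STRING, amount DOUBLE, event_ts TIMESTAMP")
          .parquet("/mnt/datalake/landing/events/"))

# Clean and filter with the DataFrame API before loading.
cleaned = (events
           .dropna(subset=["event_id"])
           .filter(F.col("amount") > 0)
           .withColumn("event_date", F.to_date("event_ts")))

# Append the processed stream to a partitioned, Hive-compatible location.
query = (cleaned.writeStream
         .format("parquet")
         .option("path", "/mnt/datalake/curated/events/")
         .option("checkpointLocation", "/mnt/datalake/_checkpoints/events/")
         .partitionBy("event_date")
         .trigger(processingTime="1 minute")
         .start())

query.awaitTermination()
```

The checkpoint location is what gives the stream exactly-once file-sink semantics across restarts; partitioning by date keeps the downstream Hive queries prunable.
-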
Data Engineer, Motorola Solutions | Sep 2020 - Feb 2022 | Toronto, Ontario, Canada
● Developed PySpark applications that protect raw data by hashing client-specified columns, ensuring data security and privacy (see the sketch after this list).
● Used the Spark SQL API in PySpark for data extraction, loading, and SQL queries, converting existing HiveQL queries into optimized Python-based transformations.
● Created on-demand tables over S3 files using Lambda functions and AWS Glue with Python and Spark, streamlining data storage and access.
● Leveraged Spark Streaming APIs to perform the necessary transformations and actions on Kafka-sourced data, persisting the results into an AWS data lake.
● Developed Spark applications that handle data from RDBMSs (MySQL, Oracle Database) and streaming sources, enabling efficient data processing.
● Led the development of an AWS data pipeline that extracts data from weblogs and stores it in Amazon EMR, ensuring seamless data integration.
● Implemented data warehousing solutions on Amazon Redshift, optimizing query performance and enabling data-driven decision-making through rapid retrieval and analysis.
● Developed custom Spark scripts in Python for data transformations, using RDDs to manipulate data efficiently, and worked with diverse file formats such as Parquet, ORC, and Avro for data import.
● Developed and deployed data processing pipelines on Amazon EMR (Elastic MapReduce) to process and analyze datasets efficiently, leveraging distributed computing on AWS infrastructure.
● Implemented real-time processing and core jobs using Spark Streaming with Kafka as the data pipeline, enabling real-time analysis and insights.
● Developed PySpark data processing tasks, including reading data from external sources, merging, enrichment, and loading into target destinations.
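As a rough illustration of the column-hashing approach in the first bullet, the sketch below masks assumed sensitive columns with SHA-256 using PySpark's built-in sha2 function; the bucket paths and column names are hypothetical, not client data.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("column-hashing").getOrCreate()

# Hypothetical input path and column list for illustration only.
raw = spark.read.parquet("s3://example-bucket/raw/customers/")
sensitive_cols = ["ssn", "email", "phone"]  # assumed client-specified columns

# Replace each sensitive column with its SHA-256 digest; cast to string
# first so the hash is stable regardless of the source column type.
masked = raw
for c in sensitive_cols:
    masked = masked.withColumn(c, F.sha2(F.col(c).cast("string"), 256))

masked.write.mode("overwrite").parquet("s3://example-bucket/masked/customers/")
```

Hashing (as opposed to reversible encryption) keeps the columns joinable across datasets while making the raw values unrecoverable; in practice a salt would typically be concatenated before hashing to resist dictionary attacks.
-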
Data Analyst, Bank of America | Aug 2018 - Jul 2020 | India
● Imported legacy data from SQL Server and Teradata into Amazon S3.
● Created consumption views on top of metrics to reduce the running time of complex queries.
● Exported data into Snowflake by creating staging tables to load files of different types from Amazon S3 (see the sketch after this list).
● As part of the data migration, wrote many SQL scripts to reconcile mismatched data and loaded history data from Teradata SQL into Snowflake.
● Developed SQL scripts to upload, retrieve, manipulate, and handle sensitive data (National Provider Identifier data, i.e., name, address, SSN, phone number) in Teradata, SQL Server Management Studio, and Snowflake databases for the project.
● Implemented a defect-tracking process using JIRA, assigning bugs to the development team.
● Involved in functional, integration, regression, smoke, and performance testing; tested Hadoop MapReduce jobs developed in Python, Pig, and Hive.
● Incorporated predictive modeling (a rule engine) to evaluate customer/seller health scores using Python scripts, performed computations, and integrated the results with Tableau visualizations.
● Worked with stakeholders to communicate campaign results, strategy, issues, and needs.
● Analyzed marketing campaigns from various perspectives, including CTR, conversion rates, seasonal/geographical trends, search queries, landing pages, conversion funnels, quality scores, competitors, and distribution channels, to achieve maximum ROI for clients.
● Understood business requirements in depth and developed test strategies based on business rules.
● Created metric tables and end-user views in Snowflake to feed data for Tableau refreshes.
● Generated custom SQL to verify dependencies for daily, weekly, and monthly jobs.
● Registered business and technical datasets for the corresponding SQL scripts using Nebula Metadata.
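A minimal sketch of the S3-to-Snowflake staging load described above, issuing a COPY INTO through the snowflake-connector-python driver; the account, credentials, external stage, and table names are assumptions for illustration only.

```python
import snowflake.connector

# Hypothetical connection parameters; in practice credentials would come
# from a secrets manager, not source code.
conn = snowflake.connector.connect(
    account="example_account",
    user="etl_user",
    password="<secret>",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Load CSV files landed in S3 (exposed via an external stage)
    # into an assumed staging table.
    cur.execute("""
        COPY INTO STAGING.LEGACY_ACCOUNTS
        FROM @S3_LEGACY_STAGE/accounts/
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
        ON_ERROR = 'ABORT_STATEMENT'
    """)
finally:
    conn.close()
```
-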
Python Developer, HBD Financial Services | Oct 2017 - Jul 2018 | India
● Involved in the entire project lifecycle, including design, development, deployment, testing, implementation, and support.
● Developed web-based applications using PHP, XML, JSON, and MVC3, with Python scripts for database updates and file manipulation, enhancing application functionality.
● Built database models, APIs, and views with Python to create interactive solutions and seamless user experiences.
● Designed and developed the presentation layer of web applications using HTML, CSS, JavaScript, jQuery, AJAX, and Bootstrap, delivering visually appealing, responsive interfaces.
● Developed XML schema documents and implemented frameworks for efficient XML parsing, streamlining data processing.
● Used Python for XML and JSON processing, facilitating data exchange and business logic implementation.
● Worked with Python OpenStack APIs and performed numerical analysis with NumPy, integrating with third-party APIs over REST and SOAP.
● Managed large datasets using Pandas DataFrames and relational databases (RDBMSs) such as MySQL, Oracle, and PostgreSQL, ensuring efficient data handling.
● Played a key role in implementing REST APIs in Python using the Flask micro-framework with a SQLAlchemy backend for data center resource management on OpenStack (see the sketch after this list).
● Used the Jenkins continuous integration tool for project deployment and managed version control with Git.
● Actively participated in Agile methodologies and the Scrum process, supporting efficient project management and delivery.
● Performed data visualization on survey data using Tableau Desktop and univariate analysis in Python (Pandas, NumPy, Seaborn, scikit-learn, and Matplotlib).
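A minimal sketch of a Flask REST API backed by SQLAlchemy, in the spirit of the resource-management endpoint mentioned above; the Resource model, routes, and database URI are hypothetical, not the original service.

```python
from flask import Flask, jsonify, request
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
# Assumed connection string for illustration.
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://user:pass@localhost/dc"
db = SQLAlchemy(app)

class Resource(db.Model):
    # Hypothetical model for a managed data center resource.
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), nullable=False)
    status = db.Column(db.String(20), default="available")

@app.route("/resources", methods=["GET"])
def list_resources():
    # Return all resources as JSON.
    return jsonify([{"id": r.id, "name": r.name, "status": r.status}
                    for r in Resource.query.all()])

@app.route("/resources", methods=["POST"])
def create_resource():
    # Create a resource from the JSON request body.
    payload = request.get_json()
    r = Resource(name=payload["name"])
    db.session.add(r)
    db.session.commit()
    return jsonify({"id": r.id}), 201

if __name__ == "__main__":
    with app.app_context():
        db.create_all()
    app.run()
```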
Education
Computer Engineering, Gujarat Technological University (GTU)