professional with 10+ years of combined experience in data engineering, ETL engineering, data analysis, Big Data implementations, and Spark technologies. Experience in Big Data ecosystems using Hadoop, Pig, Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Airflow, and Snowflake. Expertise in writing MapReduce jobs in Python that process large structured, semi-structured, and unstructured data sets and store them in HDFS. Experienced in data manipulation using Python for loading and extraction, and with Python libraries such as NumPy, SciPy, and Pandas for data analysis and numerical computation.
-
Big Data Developer/Data Analytics
State Street
Apr 2022 - Present
Boston, Massachusetts, US
Proficient in Python, PySpark, Hadoop, and Airflow, I manage massive datasets across various formats and conduct thorough data analysis to ensure quality. I specialize in implementing and maintaining efficient Hadoop clusters, leveraging Spark for large-scale processing and analysis tasks. Implemented real-time data replication using IBM InfoSphere Data Replication, ensuring data consistency across multiple environments with minimal latency. Ensured data replication processes adhered to regulatory compliance requirements, including data auditing and tracking changes across replicated systems. Expertise in managing and optimizing IBM DB2 databases, including performance tuning, backup and recovery, and implementing advanced security features.
-
Big Data Developer/Data Analytics
State of Minnesota (IT Department)
May 2020 - Apr 2022
Managed daily batch runs on Airflow clusters for data migration between ingestion layers for inventory forecasting.
Optimized replication processes by fine-tuning parameters and resource allocation, achieving significant improvements in replication speed and efficiency.
Created observability metrics with Python scripts for the inventory forecasting team to report Airflow DAG performance.
Wrote Airflow DAGs and Spark SQL queries and deployed them to the cluster.
Timed and tested DAG runs to ensure no conflicts in data movement.
Developed and managed ETL processes using IBM InfoSphere DataStage, enabling efficient data integration and transformation across various data sources.
Ensured data integrity across S3 buckets and EMR clusters with Snowflake, Hue, and Zeppelin.
Migrated data across EMR clusters using Spark commands.
Verified and tested database schemas.
-
Big Data Developer/Data Analytics
Citi
Feb 2018 - Apr 2020
New York, New York, US
-
Hadoop Developer
Navihealth
Dec 2016 - Jan 2018
Brentwood, TN, US
-
Junior Software Engineer
Tek Optimize Software India Pvt Ltd
Jun 2013 - Jul 2015
Frequently Asked Questions about Vinay R
What company does Vinay R work for?
Vinay R works for State Street.
What is Vinay R's role at the current company?
Vinay R's current role is Data Engineer.