Vijay Gupta Email and Phone Number
o Around 15 years of total IT experience, including 11 years in the United States working on ETL and big data technologies.
o Strong experience designing and developing applications and features using Python, Spark, Airflow, Presto, SQL, and pandas.
o Architected time-series data solutions, optimized AWS resource usage, and reconfigured Spark clusters, achieving a 50% cost reduction.
o Design machine learning classification models, develop feature extraction, preprocessing, and training pipelines, integrate with MLflow for efficient model management, and create serving models to predict solutions to complex problems.
o Architect and optimize Spark solutions, including fine-tuning Spark code, SQL queries, and cluster configurations, to drive substantial improvements in performance and efficiency and to ensure optimal use of AWS resources.
o Skilled in designing and developing frameworks for seamless interaction with Google Sheets, PostgreSQL, and Presto/Trino; simplified the ingestion process by consolidating these features into a single framework/package.
o Expert in designing robust monitoring and diagnostic solutions using AWS CloudWatch and kubectl.
o Hands-on experience creating continuous, batch, and generic graphs/pipelines and Conduct>It using Ab Initio.
o Designed and developed a machine learning classification model using scikit-learn and integrated it into MLflow for serving.
o Skilled in using AWS services including S3, EC2, RDS, CloudWatch, and Amazon WorkSpaces.
o Knowledge of large language models and generative AI.
Nielsen
Website: nielsen.com
Employees: 29140
Solution Engineer and Data Architect, Nielsen, Charlotte, NC, US
Lead Data Engineer / Data Architect, Nielsen
Dec 2021 - Present, United States
o Designed and developed Airflow pipelines to process audience demographics data and Nielsen panel data using Python, Spark, and SQL.
o Building an ML classification model to predict potential solutions to upcoming Airflow DAG failures based on patterns in the logs.
o Designed and developed a pipeline over time-series data that captures AWS resource utilization (CPU utilization, memory utilization, and utilization ratio); reconfigured underutilized clusters for numerous pipelines across the organization, reducing AWS costs by 50%.
o Optimized multiple data pipelines (Identity, Digital ads rating, Dag rest of web, and Nielsen One Alpha products) that process huge amounts of data, and provided many dynamic configurations to reduce AWS resource usage.
o Created different cluster configurations, such as child clusters inheriting from a parent cluster, distributing clusters for parallel tasks, spawning clusters according to data size or specific parameters in job requests, and splitting loads into multiple runs. Created well-tested T-shirt-size Spark templates and wiki pages explaining how to use these configurations.
o Created a Python package to interact with Google Sheets, Postgres, and Presto/Trino, simplifying data ingestion into MDL.
o Migrated pipelines from Spark 2.4 to Spark 3.0/3.2, documenting issues and solutions in detailed wiki pages for future reference.
Lead Data Engineer / Data Architect, Synechron
Apr 2021 - Dec 2021, United States
o Collaborated closely with Business Analysts to gather and analyze requirements.
o Designed and developed pipelines to calculate features related to customer demographics and DNDB customers, which were then loaded into Elasticsearch as JSON files.
o Analyzed Hive and Impala tables to verify data accuracy prior to ingestion into Elasticsearch.
o Enhanced the performance, efficiency, and AWS resource utilization of big data applications by analyzing time-series metrics with Pepperdata.
o Optimized Spark cluster and worker configurations according to data volume and AWS resource utilization for jobs.
o Analyzed and addressed small-file issues by evaluating block sizes and optimizing data pipelines to prevent the creation of excessive small files, improving overall processing efficiency.
Lead Data Engineer, Capgemini
Aug 2019 - Apr 2021, Charlotte, North Carolina, United States
o Worked closely with the data science team on requirement gathering and feature model validation.
o Built the features data model and created a training dataset using Python and PySpark.
o Created DataFrames, RDDs, and Spark SQL logic to produce predictor variables.
o Created reusable templates for generating the configuration files that are input to the Spark framework.
o Wrote complex Hive and Impala SQL to analyze data and validate transformations.
o Used Apache Drill / Drill Explorer to query HDFS data for analysis and to validate output files.
o Identified areas of improvement in the existing business by analyzing vast amounts of data using Hive and PySpark.
o Analyzed various sources and consolidated them in the data foundry.
Sr. Data Engineer, Capgemini
Apr 2017 - Aug 2019, Tampa, Florida, United States
o Worked with business users and business analysts on requirements gathering and clarification.
o Handled end-to-end system design, development, coordination with offshore and other teams in a data warehousing environment, production implementation, and support.
o Prepared an inventory of ETL jobs and estimated the effort to migrate them to Python and Spark.
o Implemented a set of metadata tables storing detailed schema information, including column names, types, order, size, data integrity rules, and data quality rules, to ensure accurate data.
o Designed and developed a framework that performs data validation and quality checks against metadata definitions, automating schema enforcement and ensuring consistency across datasets.
o Designed and developed Databricks pipelines to replicate existing ETL logic using Python and Spark.
o Worked with a peer team to create Hive external tables on top of source and target Parquet files.
ETL Lead, Cognizant
Sep 2014 - Mar 2017, Richmond, Virginia Area
o Developed Ab Initio graphs for various same-day and 2-day matching models.
o Worked on xfrs, psets, and generic graphs.
o Developed Unix scripts to automate various batch graphs.
o Built and tested auto-match rules to perform matching for any data loaded to recon.
o Created an automation Unix script for testing, which significantly reduced testing effort.
o Created Control-M jobs to automate the entire matching process.
o Worked on Ab Initio generic graphs, psets, xfrs, and shell scripts to SFTP files from various DDE servers to the IDQ server.
o Built an automation script to generate a report of 600+ monthly jobs and notify the concerned team.
o Migrated various applications from Ab Initio version 2.15 to 3.04.
o Hands-on experience with the version control system (EME).
o Worked on metadata management services (MDH).
Technical Business Analyst, Tech Mahindra (Formerly Mahindra Satyam)
Sep 2013 - Sep 2014, Norwalk, CT
o Analyzed different sub-ledgers and created complex SQL queries to load data from the Stage to Core to Access layers.
o Gained an understanding of business process definition, risk analysis, and SDLC methodologies.
o Designed and developed ETL mappings to pull source data from heterogeneous systems, transform it per business rules, and handle data cleansing, error handling for data loads, and data validation.
o Worked on Hyperion EPM 11 reports.
o Developed a tool to flag amounts that were paid off at the last moment and not reflected in the reporting layer.
ETL Developer, Tech Mahindra (Formerly Mahindra Satyam)
Mar 2010 - Sep 2013, Hyderabad Area, India
o Involved in the project from scratch and participated in requirement gathering, design, and development activities.
o Designed and developed continuous and batch Ab Initio graphs.
o Developed the calculation logic for various products across the different services in the pricing model.
o Extracted and loaded data from various sources such as Oracle/Teradata databases, flat files, and XML files.
o Built UNIX shell scripts and handled the project's back end.
o Migrated the project from Ab Initio version 1.3 to 3.04.
o Worked on advanced Ab Initio features such as Component Folding, Micrographs, Metaprogramming, and Conduct>IT.
o Migrated all automated jobs from the Maestro scheduler to the Autosys scheduler.
Vijay Gupta Education Details
Data Science
Electrical, Electronics and Communications Engineering
Avb Public School, Computer Science
Frequently Asked Questions about Vijay Gupta
What company does Vijay Gupta work for?
Vijay Gupta works for Nielsen.
What is Vijay Gupta's role at the current company?
Vijay Gupta's current role is Solution Engineer and Data Architect.
What schools did Vijay Gupta attend?
Vijay Gupta attended University of North Carolina at Charlotte; ITM University, Gwalior; and Avb Public School.
Who are Vijay Gupta's colleagues?
Vijay Gupta's colleagues are Gaurav Mishra, M.s, Malik Waghu, Joe Rincones, Yusofi Yusofi, Naomi Leftwich, Fernanda Van Rankin, Alen James.