Vijay Gupta

Solution Engineer and Data Architect @ Nielsen
Charlotte, NC, US
Vijay Gupta's Location
Charlotte, North Carolina, United States
About Vijay Gupta

o Around 15 years of total IT experience, including 11 years in the United States, working on ETL and big data technologies.
o Strong experience designing and developing applications and features using Python, Spark, Airflow, Presto, SQL, and pandas.
o Architected time-series data solutions, optimized AWS resource usage, and reconfigured Spark clusters, achieving a 50% cost reduction.
o Design machine learning classification models, develop feature extraction, preprocessing, and training pipelines, integrate with MLflow for efficient model management, and create serving models to predict solutions to complex problems.
o Architect and optimize Spark solutions, including fine-tuning Spark code, SQL queries, and cluster configurations to improve performance and efficiency and ensure optimum use of AWS resources.
o Skilled in designing and developing frameworks for seamless interaction with Google Sheets, PostgreSQL, and Presto/Trino; simplified the ingestion process by consolidating these features into a single framework/package.
o Expert in designing robust monitoring and diagnostic solutions using AWS CloudWatch and kubectl.
o Hands-on experience creating continuous, batch, and generic graphs/pipelines and Conduct>It using Ab Initio.
o Designed and developed a machine learning classification model using scikit-learn and integrated it into MLflow for serving.
o Skilled in using AWS services including S3, EC2, RDS, CloudWatch, and Amazon WorkSpaces.
o Knowledge of large language models and generative AI.

Vijay Gupta's Current Company Details
Nielsen

Solution Engineer and Data Architect
Charlotte, NC, US
Website: nielsen.com
Employees: 29,140
Vijay Gupta Work Experience Details
  • Nielsen
    Solution Engineer and Data Architect
    Nielsen
    Charlotte, NC, US
  • Nielsen
    Lead Data Engineer/ Data Architect
    Nielsen Dec 2021 - Present
    United States
    o Designed and developed Airflow pipelines to process audience demographics data and Nielsen panel data using Python, Spark, and SQL.
    o Building an ML classification model to predict potential solutions to upcoming Airflow DAG failures based on patterns in the log.
    o Designed and developed a pipeline over time-series data that captured AWS resource utilization (CPU utilization, memory utilization, and utilization ratio); reconfigured underutilized clusters for numerous pipelines across the organization, reducing AWS cost by 50%.
    o Optimized multiple data pipelines (Identity, Digital ads rating, Dag rest of web, and Nielsen One Alpha products) that process huge amounts of data, and provided many dynamic configurations to reduce AWS resource usage.
    o Created different cluster configurations, such as a child inheriting its parent cluster, distributing clusters for parallel tasks, spawning clusters according to data size or specific parameters in job requests, and splitting loads into multiple runs. Created well-tested T-shirt-size Spark templates and wiki pages documenting how to use these configurations.
    o Created a Python package to interact with Google Sheets, Postgres, and Presto/Trino, simplifying data ingestion into MDL.
    o Migrated pipelines from Spark 2.4 to Spark 3.0/3.2, documenting issues and solutions in detailed wiki pages for future reference.
  • Synechron
    Lead Data Engineer/Data Architect
    Synechron Apr 2021 - Dec 2021
    United States
    o Collaborated closely with Business Analysts to gather and analyze requirements.
    o Designed and developed pipelines to calculate features related to customer demographics and DNDB customers, which were then loaded into Elasticsearch as JSON files.
    o Analyzed Hive and Impala tables to verify data accuracy prior to ingestion into Elasticsearch.
    o Enhanced the performance, efficiency, and AWS resource utilization of big data applications by analyzing time-series metrics with Pepperdata.
    o Optimized Spark cluster and worker configurations according to data volume and AWS resource utilization for jobs.
    o Analyzed and addressed small-file issues by evaluating block sizes and optimizing data pipelines to prevent the creation of excessive small files, improving overall processing efficiency.
  • Capgemini
    Lead Data Engineer
    Capgemini Aug 2019 - Apr 2021
    Charlotte, North Carolina, United States
    o Working closely with the Data Science team on requirements gathering and feature model validation.
    o Building a feature data model and creating a training dataset using Python and PySpark.
    o Creating DataFrames, RDDs, and Spark SQL logic to produce predictor variables.
    o Created reusable templates for generating the configuration files that serve as input to the Spark framework.
    o Writing complex Hive and Impala SQL to analyze data and validate transformations.
    o Used Apache Drill/Drill Explorer to query HDFS data for analysis and to validate output files.
    o Identified areas of improvement in the existing business by analyzing vast amounts of data using Hive and PySpark.
    o Analyzed various sources and consolidated them in the data foundry.
  • Capgemini
    Sr. Data Engineer
    Capgemini Apr 2017 - Aug 2019
    Tampa, Florida, United States
    o Worked with business users and business analysts on requirements gathering and clarification.
    o End-to-end system design, development, coordination with offshore and various teams in a data warehousing environment, production implementation, and support.
    o Prepared an ETL jobs inventory and estimated the effort to migrate ETL jobs to Python and Spark.
    o Implemented a set of metadata tables storing detailed schema information, including column names, types, order, size, data integrity, and data quality rules, to ensure accurate data.
    o Designed and developed a framework that performs data validation and quality checks against metadata definitions, automating schema enforcement and ensuring consistency across datasets.
    o Designed and developed Databricks pipelines to replicate existing ETL logic using Python and Spark.
    o Worked with a peer team to create Hive external tables on top of source and target Parquet files.
  • Cognizant
    ETL Lead
    Cognizant Sep 2014 - Mar 2017
    Richmond, Virginia Area
    o Developing Ab Initio graphs for various same-day and 2-day matching models.
    o Working on xfrs, psets, and generic graphs.
    o Developing Unix scripts to automate various batch graphs.
    o Building and testing auto-match rules to perform matching for any data loaded to recon.
    o Created an automation Unix script for testing work, which reduced testing effort considerably.
    o Creating Control-M jobs to automate the entire matching process.
    o Working on Ab Initio generic graphs, psets, xfrs, and shell scripts to SFTP files from various DDE servers to the IDQ server.
    o Built an automation script to generate a report of 600+ monthly jobs and send notifications to the concerned team.
    o Migrated various applications from Ab Initio version 2.15 to 3.04.
    o Hands-on experience with the version control system (EME).
    o Worked on metadata management services (MDH).
  • Tech Mahindra (Formerly Mahindra Satyam)
    Technical Business Analyst
    Tech Mahindra (Formerly Mahindra Satyam) Sep 2013 - Sep 2014
    Norwalk, CT
    o Analyzing different sub ledgers and creating complex SQL queries to load data into Stage to Core to Access Layer.
    o Understanding the business process definition, risk analysis, and SDLC methodologies.
    o Designed and developed ETL mappings to get source data from heterogeneous systems, transform data as per business rules, and handle data cleansing, error handling for data loads, and data validation.
    o Worked on Hyperion EPM 11 reports.
    o Developed a tool to scratch the various amounts that are paid off at the last moment and not reflected in the reporting layer.
  • Tech Mahindra (Formerly Mahindra Satyam)
    ETL Developer
    Tech Mahindra (Formerly Mahindra Satyam) Mar 2010 - Sep 2013
    Hyderabad Area, India
    o Involved in the project from scratch and participated in requirements gathering, design, and development activities.
    o Designed and developed continuous and batch Ab Initio graphs.
    o Developed the calculation logic of various products for the different services in the pricing model.
    o Extracted and loaded data from various data sources such as Oracle/Teradata databases, flat files, and XML files.
    o Built UNIX shell scripts and handled the project's back end.
    o Migrated the project from Ab Initio version 1.3 to 3.04.
    o Worked on advanced features of Ab Initio such as Component Folding, Micrographs, Metaprogramming, and Conduct>It.
    o Migrated all automated jobs from the Maestro scheduler to the Autosys scheduler.

Vijay Gupta Education Details

Frequently Asked Questions about Vijay Gupta

What company does Vijay Gupta work for?

Vijay Gupta works for Nielsen.

What is Vijay Gupta's role at the current company?

Vijay Gupta's current role is Solution Engineer and Data Architect.

What schools did Vijay Gupta attend?

Vijay Gupta attended University of North Carolina at Charlotte; ITM University, Gwalior; and Avb Public School.

Who are Vijay Gupta's colleagues?

Vijay Gupta's colleagues are Gaurav Mishra, M.s, Malik Waghu, Joe Rincones, Yusofi Yusofi, Naomi Leftwich, Fernanda Van Rankin, Alen James.
