Vikas B is a Senior Data Engineer at Nationwide.
Senior Data Engineer, Nationwide — United States
Senior Data Engineer, Express Scripts
Jan 2023 - Present, St. Louis, Missouri, United States
• Worked in a multi-cloud environment with both AWS and GCP services.
• Designed and developed batch and streaming pipelines using AWS and GCP services for different clients.
• Designed and deployed Hadoop clusters and Big Data analytics tooling, including Apache PySpark, on the Cloudera Distribution.
• Worked extensively with SQL, PL/SQL, Scala, and UNIX shell scripting.
• Created data-mapping specifications and executed detailed system test plans; a data-mapping specification defines what data is extracted from an internal data warehouse, how it is transformed, and what is sent to an external entity.
• Built and architected multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP.
• Worked with GCP Dataproc, Dataflow, Pub/Sub, GCS, Cloud Functions, BigQuery, Stackdriver, Cloud Logging, and Data Studio for reporting.
• Built data pipelines in Airflow on GCP for ETL jobs using different Airflow operators.
• Migrated an entire Oracle database to BigQuery and used Power BI for reporting.
• Used the Cloud Shell SDK in GCP to configure Dataproc, Cloud Storage, and BigQuery.
• Used MongoDB as a data source for importing part of the client data.
• Wrote Scala programs for Spark transformations in Dataproc.
• Worked on a POC evaluating various cloud offerings, including Google Cloud Platform (GCP).
• Developed a POC for migrating an on-premises Hadoop MapR system to GCP.
• Compared self-hosted Hadoop against GCP Dataproc, and explored Bigtable (managed HBase) use cases and performance.
• Developed and deployed the outcome using Spark and Scala code on a Hadoop cluster running on GCP.
• Developed Python scripts to parse flat files (CSV, XML, JSON) as well as Scala and Terraform files, extract data from various sources, and load it into the data warehouse.
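The data-mapping bullet above describes a spec-driven transform step. A minimal sketch of applying such a specification is below; the field names and transforms are hypothetical, not taken from any actual specification.

```python
# Sketch of applying a data-mapping specification during ETL.
# The source fields, target fields, and transforms below are hypothetical.

# Mapping spec: target field -> (source field, transform applied in flight)
MAPPING_SPEC = {
    "member_id": ("MBR_ID", str.strip),
    "plan_code": ("PLAN_CD", str.upper),
    "claim_amt": ("CLM_AMT", float),
}

def apply_mapping(source_row: dict) -> dict:
    """Extract and transform one warehouse row per the mapping spec."""
    return {
        target: transform(source_row[source])
        for target, (source, transform) in MAPPING_SPEC.items()
    }

row = {"MBR_ID": " 1001 ", "PLAN_CD": "gold", "CLM_AMT": "42.50"}
mapped = apply_mapping(row)
# mapped == {"member_id": "1001", "plan_code": "GOLD", "claim_amt": 42.5}
```

Keeping the spec as data rather than code makes it easy to review against the external entity's requirements and to test each field rule in isolation.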
Senior Big Data Engineer, Walmart
Jul 2020 - Dec 2022, Rogers, Arkansas, United States
• Worked on multi-cloud architecture, designing and building multiple data pipelines and end-to-end ETL, from data ingestion through transformation, in GCP and AWS.
• Implemented data processing from a GCP Pub/Sub topic into BigQuery using Cloud Dataflow with Python, and used REST APIs with Python to ingest data from other systems into BigQuery.
• Executed Cloud Dataflow jobs to validate data between raw source files and BigQuery, and integrated monitoring of BigQuery and Dataproc using Stackdriver across environments.
• Wrote Perl scripts covering data-feed handling and business logic, communicating with web services through the SOAP::Lite module and WSDL.
• Generated data cubes using Hive, Pig, and Java MapReduce on a provisioned Hadoop cluster in AWS.
• Created S3 buckets (configuration, policies, permissions), used AWS S3 for data storage and backup, and used AWS Glacier to store archive data.
• Implemented AWS solutions using EC2, S3, RDS, EBS, Elastic Load Balancer, Glue pipelines, Glue Crawler, Auto Scaling groups, and optimized volumes; created monitors, alarms, and notifications for EC2 hosts using CloudWatch.
• Implemented AWS Step Functions to automate and orchestrate Amazon SageMaker tasks such as publishing data to S3, training an ML model, and deploying it for prediction.
• Handled AWS EC2, Amazon S3, Amazon RDS, Elastic Load Balancer, Auto Scaling, CloudWatch, SNS, AWS Lambda, and Step Functions.
• Generated consumer-group lag metrics from Kafka using its API; used Kafka for building real-time data pipelines between clusters. Extracted files from Hadoop and dropped them into S3 on a daily and hourly basis.
• Authored Python (PySpark) scripts with custom UDFs for row/column manipulations, merges, aggregations, stacking, data labelling, and all cleaning and conforming tasks.
• Migrated an entire Oracle database to BigQuery, and built data pipelines in Airflow on GCP for ETL jobs using different Airflow operators.
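The validation bullet above compares raw source files against loaded BigQuery tables. A minimal sketch of that check is below, using stdlib stand-ins; the CSV content and the `bigquery_rows` result set are hypothetical, and a real job would read from GCS and query BigQuery.

```python
import csv
import io

# Row-count and key-set validation between a raw source file and a
# warehouse table. Both datasets here are in-memory stand-ins.
raw_csv = io.StringIO("id,amount\n1,10.0\n2,20.0\n3,30.0\n")
source_rows = list(csv.DictReader(raw_csv))

# Pretend this came back from a query on the loaded BigQuery table.
bigquery_rows = [
    {"id": "1", "amount": "10.0"},
    {"id": "2", "amount": "20.0"},
    {"id": "3", "amount": "30.0"},
]

def validate(source, target):
    """Return (counts_match, keys_missing_from_target)."""
    source_ids = {r["id"] for r in source}
    target_ids = {r["id"] for r in target}
    return len(source) == len(target), source_ids - target_ids

counts_match, missing = validate(source_rows, bigquery_rows)
```

Count and key-set checks are cheap first-pass validations; column-level checksums can be layered on the same pattern when stricter reconciliation is needed.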
Senior Data Engineer, Centene Corporation
Mar 2018 - Jun 2020, St. Louis, Missouri, United States
• Migrated an existing on-premises application to Amazon Web Services (AWS), using services such as EC2 and S3 for small-dataset processing and storage; experienced in maintaining the Hadoop cluster on AWS EMR.
• Developed solutions to pre-process large sets of structured and semi-structured data in file formats including Text, Avro, Sequence, XML, JSON, and Parquet.
• Developed an ETL process in AWS Glue to migrate customer data from external data stores such as S3 into AWS Redshift.
• Built pipelines to copy data from multiple sources to destinations in AWS Redshift.
• Migrated data from the Redshift data warehouse to a Snowflake database.
• Built dimensional models and a data vault architecture on Snowflake.
• Built a scalable, distributed Hadoop cluster running the Hortonworks Data Platform (HDP).
• Developed Spark code using Scala and Spark SQL for faster testing and processing of data, and explored optimizations using SparkContext, Spark SQL, and pair RDDs.
• Serialized JSON data and stored it in tables using Spark SQL.
• Used Spark Streaming to collect data from Kafka in near real time, performed the necessary transformations and aggregations to build the common learner data model, and stored the data in a NoSQL store (HBase).
• Worked with the Spark framework for both batch and real-time data processing.
• Used Spark MLlib for predictive intelligence, customer segmentation, and smooth maintenance in Spark Streaming.
• Implemented and maintained a Hadoop cluster on AWS EMR.
• Loaded data into S3 buckets using AWS Glue and Spark.
• Implemented Spark on EMR for data processing in an AWS data lake.
• Designed and developed Spark workflows using Scala to pull data from AWS S3 buckets.
• Utilized AWS Glue for data cataloging, ETL processing, and data preparation, enabling seamless integration of diverse data sources and efficient transformation of large-scale datasets.
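The streaming bullet above rolls Kafka events up into a per-learner model before writing to HBase. A pure-Python sketch of that per-batch aggregation step is below; the event schema (learner_id, course, score) is hypothetical, and a real pipeline would run this inside a Spark Streaming micro-batch.

```python
from collections import defaultdict

def aggregate_batch(events):
    """Roll one micro-batch of events up into per-learner aggregates."""
    model = defaultdict(lambda: {"events": 0, "total_score": 0.0})
    for e in events:
        rec = model[e["learner_id"]]
        rec["events"] += 1
        rec["total_score"] += e["score"]
    return dict(model)

# One hypothetical micro-batch, as it might arrive from Kafka.
batch = [
    {"learner_id": "L1", "course": "math", "score": 0.5},
    {"learner_id": "L1", "course": "sci",  "score": 0.25},
    {"learner_id": "L2", "course": "math", "score": 0.75},
]
aggregates = aggregate_batch(batch)
```

The per-learner dictionary keys map naturally onto HBase row keys, which keeps the downstream write a simple per-key put.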
Data Engineer, Black Knight
Feb 2016 - Feb 2018, Jacksonville, Florida, United States
• Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and a write-back tool.
• Built a scheduling framework using the Python SDK to launch ADF jobs and collect their status; statuses are stored in a SQL database so that real-time dashboards can be built for visual representation of pipelines.
• Developed Python UDFs for handling nested JSON data from the source system and flattening it to line-item-level records; these flattened records are further transformed using Spark for daily aggregations and reporting.
• Developed data pipelines and ETL workflows in Azure Synapse Analytics using Azure Data Factory, Azure Databricks, and Azure Synapse Studio.
• Built and deployed Azure Synapse Analytics pipelines and workflows using Azure DevOps and Azure Resource Manager templates.
• Integrated Azure Synapse Analytics with other Azure services such as Azure Blob Storage, Azure Event Hubs, and Azure Key Vault.
• Implemented a self-hosted integration runtime on Windows Server to establish a secure connection between the Hadoop cluster and Azure Data Factory, migrating data from HDFS to Azure Data Lake.
• Implemented versioning through Azure Git repositories for Azure Data Factory, and scheduled all pipelines through scheduled triggers.
• Used Kafka and Spark Streaming for data ingestion and cluster handling in real-time processing; developed flow XML files using Apache NiFi, a workflow automation tool, to ingest data into HDFS.
• Designed snowflake schemas for the data warehouse and ODS architecture using data-modeling tools such as Erwin.
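The UDF bullet above flattens nested JSON to line-item-level records. A minimal, self-contained sketch of such a flattening function is below; the field names (order_id, items, sku, qty) are hypothetical and only illustrate the shape of the transform.

```python
import json

def flatten_order(order_json: str) -> list:
    """Explode one nested order document into flat line-item rows."""
    order = json.loads(order_json)
    return [
        {"order_id": order["order_id"], "sku": item["sku"], "qty": item["qty"]}
        for item in order["items"]
    ]

# One hypothetical nested source document.
doc = json.dumps({
    "order_id": "A100",
    "items": [{"sku": "S1", "qty": 2}, {"sku": "S2", "qty": 1}],
})
rows = flatten_order(doc)
# rows[0] == {"order_id": "A100", "sku": "S1", "qty": 2}
```

Once records are at line-item grain, the daily aggregations described above reduce to plain group-bys over flat columns, which is the form Spark handles most efficiently.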
ETL Developer, Hansa Solutions
Jun 2013 - Nov 2015, Hyderabad, Telangana, India
• Developed advanced PL/SQL packages, procedures, triggers, functions, indexes, and collections to implement business logic using SQL Navigator.
• Generated server-side PL/SQL scripts for data manipulation and validation, and materialized views for remote instances.
• Created management analysis reporting using parallel queries and Java stored procedures; participated in change and code reviews to understand the testing needs of changed components, and troubleshot defects in a timely manner.
• Performed defragmentation of tables, partitioning, compression, and indexing for improved performance and efficiency; redesigned tables with partitions and partition indexes to improve database performance and maintainability.
• Experienced in database application development, query optimization, performance tuning, and DBA solutions across the complete system development life cycle.
• Used Informatica PowerCenter Designer to analyze source data and extract and transform it from various source systems, incorporating business rules using the objects and functions the tool supports.
• Used PowerCenter Designer to create mappings and mapplets to transform data according to business rules.
• Used transformations such as Source Qualifier, Joiner, Lookup, SQL, Router, Filter, Expression, and Update Strategy.
• Created and configured workflows and sessions to load data into target Oracle tables using Informatica Workflow Manager.
• Implemented complex business rules in Informatica PowerCenter by creating reusable transformations and robust mapplets.
• Tuned the performance of sources, targets, mappings, and sessions by identifying bottlenecks, and used debuggers to diagnose and fix complex mappings.
• Designed and developed Informatica workflows to extract data from XML files and load it into the database.