I am a Data Engineer with over seven years of experience designing, developing, and maintaining data pipelines across a range of use cases and industries. I currently work at Walmart, where I leverage Google Cloud Platform (GCP) services and Apache Spark to process vast datasets for retail analytics and forecasting. I also have expertise with AWS services, Kafka, and various databases, including Hive, Oracle, PostgreSQL, and DB2.

At Walmart, I have contributed to several projects that enhanced data storage, analysis, and visualization for the business. For example, I designed and developed data pipelines for item forecasting, club-level item ranking, and sourcing, using GCP, Spark, Scala, and Python. I also implemented Kafka for real-time data streaming, enabling near-instantaneous data processing for different applications.

I have a passion for learning new technologies and solving complex data problems. I value collaboration, innovation, and customer satisfaction, and I strive to bring diverse perspectives and experiences to the team.
Data Engineer, Cotiviti
United States
Data Engineer, Walmart
Nov 2022 - Present | Bentonville, Arkansas, US
• Designed, developed, and maintained data pipelines for retail use cases, processing vast datasets for item forecasting, club-level item ranking, and sourcing.
• Leveraged Google Cloud Platform (GCP) services, including Google Dataproc, Google Cloud Storage (GCS), and BigQuery, to enhance data storage, analysis, and visualization.
• Utilized Apache Spark with Scala and Python to build high-performance, distributed data processing solutions, reducing data processing times by 40%.
• Worked with various databases, including Hive, Oracle, PostgreSQL, and DB2, optimizing data retrieval and storage strategies (ORC, Parquet).
• Implemented Kafka for real-time data streaming, enabling near-instantaneous data processing for time-sensitive applications.
• Integrated Azure services for hybrid data solutions, enhancing data accessibility and availability.
• Assisted in the evaluation and selection of data engineering tools, frameworks, and technologies to keep the data stack up to date and aligned with industry best practices.
• Created SQL and PL/SQL scripts for sourcing data, including creating tables, materialized views, and stored procedures, and loading data into the tables.
AWS Data Engineer, IQVIA
Feb 2021 - Nov 2022 | Durham, North Carolina, US
• Performed end-to-end architecture and implementation assessments of various AWS services, including Amazon EMR, Redshift, S3, Athena, Glue, and Kinesis.
• Involved in migrating a quality monitoring tool from AWS EC2 to AWS Lambda and built logical datasets to administer quality monitoring on Snowflake warehouses.
• Created ETL jobs on AWS Glue to load vendor data from different sources, with transformations involving data cleaning, data imputation, and data mapping, storing the results in S3 buckets; the stored data was later queried using AWS Athena.
• Experienced in performance tuning of Spark applications: setting the right batch interval time, the correct level of parallelism, and memory tuning.
• Used AWS Data Pipeline for data extraction, transformation, and loading from homogeneous or heterogeneous data sources, and built various graphs for business decision-making using Python's matplotlib library.
Data Engineer, American Express
Dec 2019 - Jan 2021 | New York, NY, US
• Involved in file movements between HDFS and AWS S3, worked extensively with S3 buckets in AWS, and converted all Hadoop jobs to run in EMR by configuring the cluster according to the data size.
• Worked on data pipeline creation to convert incoming data to a common format, prepare data for analysis and visualization, migrate between databases, share data processing logic across web apps, batch jobs, and APIs, and consume large XML, CSV, and fixed-width files; created data pipelines in Kafka to replace batch jobs with real-time data.
• Collected data using Spark Streaming from an AWS S3 bucket in near real time, performed the necessary transformations and aggregations on the fly to build the common learner data model, and persisted the data in HDFS.
• Extensively used Stash (Bitbucket) for code control and worked on AWS components such as Airflow, Elastic MapReduce (EMR), Athena, and Snowflake.
Big Data Engineer, Universal Pictures
Dec 2018 - Nov 2019 | Universal City, CA, US
• Developed a data pipeline using Flume to ingest data and customer histories into HDFS for analysis.
• Involved in moving all log files generated from various sources to HDFS for further processing through Kafka.
Big Data Engineer, Robert Bosch Ltd
Sep 2016 - Nov 2018 | Peterborough, England, GB
• Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data as DataFrames, and saved the data in HDFS.
• Worked on analyzing Hadoop clusters using different big data analytic tools, including the HBase database and Sqoop.
Java Developer, Avon Technologies Pvt Ltd.
Jun 2015 - Aug 2016 | IN
• Coded different deployment descriptors using XML; generated JAR files were deployed on an Apache Tomcat server.
• Implemented Java/J2EE design patterns such as Business Delegate, Data Transfer Object (DTO), and Data Access Object.