As a Lead Data Engineer with 7+ years of experience, I tackle the complexities of real-time e-commerce data, focusing on competitor insights and pricing intelligence. Our approach centers on robust data storage and sophisticated processing through upstream services. I build scalable distributed data solutions using Spark, Sqoop, Hive, and Cassandra for various clients. I am responsible for ingesting, processing, and analyzing large volumes of structured and unstructured data from multiple sources, and for developing frameworks, building code, and handling optimization and data quality on HDFS data lakes. I have worked on a range of projects involving large-scale data in various forms. I am a certified Azure Data Engineer experienced in Blob Storage, Databricks, ADF, and Synapse Analytics, with strong experience in PySpark and SQL and intermediate experience in Scala Spark. Previously, at PwC and Accenture, I played a pivotal role in constructing data lakes and streaming applications, contributing to a comprehensive customer view that empowers sales teams. I have a strong background in Hadoop, MapReduce, HBase, Linux, Python, Azure, Snowflake, SQL, Kafka, Airflow, and ETL, with basic knowledge of AWS and GCP. I hold a B.Tech degree in Electrical and Electronics, and I am passionate about learning new technologies and creating impactful solutions for complex business problems. I am also a content creator and writer, sharing my knowledge and insights with over 100K followers.
Mentor - Data Engineering, Topmate.io
Jan 2024 - Present | San Francisco, California, US
1. Questions in Data Engineering? Don't worry, I've got you covered.
2. Want to move to Data Engineering? I am here to help you.
3. Any technical doubts? Feel free to connect with me.
I am a Data Engineer with 7 years of experience, rated 4.8 stars on Topmate.
Lead Data Engineer, Brillio
Dec 2023 - Jun 2024 | Edison, New Jersey, US
Working on one of the largest e-commerce projects, collecting competitor data, pricing information, and seller details in real time, storing it in databases, and processing it using upstream services.
Big Data Consultant, PwC
Jun 2022 - Apr 2024 | GB
• Responsible for building scalable distributed data solutions using Spark.
• Ingested log files from source servers into HDFS data lakes using Sqoop.
• Developed Sqoop jobs to ingest customer and product data into HDFS data lakes.
• Developed Spark Streaming applications to ingest transactional data from Kafka topics into Cassandra tables in near real time.
• Developed a Spark application to flatten the incoming transactional data using various dimensional tables and persist it to Cassandra tables.
• Involved in developing a framework for metadata management on HDFS data lakes.
• Worked on various Hive optimizations such as partitioning, bucketing, vectorization, and indexing, and used the right type of Hive join, such as Bucket Map Join and SMB Join.
• Worked with various file formats, including CSV, JSON, ORC, Avro, and Parquet.
• Developed HQL scripts to create external tables and analyze incoming and intermediate data for analytics applications in Hive.
• Optimized Spark jobs using techniques such as broadcasting, executor tuning, and persisting.
• Responsible for developing custom UDFs, UDAFs, and UDTFs in Hive.
• Analyzed tweets in JSON format using the Hive SerDe API to deserialize them into a readable format.
• Orchestrated Hadoop and Spark jobs using Oozie workflows to create job dependencies and run multiple jobs in sequence for processing data.
• Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
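The Hive layout optimizations mentioned above (partitioning by a key column, then bucketing rows by a stable hash so that equal join keys co-locate for Bucket Map and SMB joins) can be illustrated with a minimal pure-Python sketch; the column names and bucket count are hypothetical, not taken from the actual project:

```python
import zlib
from collections import defaultdict

def hash_bucket(value, num_buckets):
    # Stable (non-salted) hash, so equal keys always land in the same
    # bucket across runs -- the property that makes bucketed joins work.
    return zlib.crc32(value.encode()) % num_buckets

def partition_and_bucket(rows, partition_key, bucket_key, num_buckets):
    """Group rows Hive-style: one directory per partition value
    (e.g. dt=2024-01-01), then a fixed number of hash buckets inside it."""
    layout = defaultdict(lambda: defaultdict(list))
    for row in rows:
        partition = f"{partition_key}={row[partition_key]}"
        bucket = hash_bucket(row[bucket_key], num_buckets)
        layout[partition][bucket].append(row)
    return layout
```

A partition-pruned query then only touches one directory, and a bucketed join only pairs up matching bucket numbers from the two tables.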
Big Data Associate Consultant, PwC
Oct 2021 - May 2022 | GB
Project description: Provides a 360-degree view of the customer so that a salesperson is aware of all the facts when talking to a customer, giving a much better chance to close the deal. This involved building a data lake. Data sources used Hadoop tools to transfer data to and from HDFS; some of the sources were imported using Sqoop, and the raw data was then stored in Hive tables in ORC format so that data scientists could perform analytics using Hive. New use cases were developed and written to a NoSQL database (HBase) for further analytics.
Environment: Cloudera CDH 5.4.4
Roles and responsibilities:
• Developed Sqoop scripts to import the source data from an Oracle database into HDFS for further processing.
• Developed Hive scripts to store raw data in ORC format.
• Involved in gathering requirements, design, development, and testing.
• Generated reports using Hive for business requirements received on an ad hoc basis.
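The Oracle-to-HDFS Sqoop import step described above can be sketched as a small command builder using standard Sqoop 1.x flags; the JDBC host, credentials, table, and target directory below are hypothetical placeholders, not values from the actual engagement:

```python
def build_sqoop_import(connect, username, table, target_dir,
                       num_mappers=4, split_by=None):
    """Assemble a `sqoop import` command line that pulls one RDBMS
    table into an HDFS directory, parallelized across mappers."""
    cmd = [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]
    if split_by:
        # Column Sqoop uses to split the table into mapper-sized ranges.
        cmd += ["--split-by", split_by]
    return cmd

# Hypothetical Oracle source feeding the raw zone of the data lake.
cmd = build_sqoop_import(
    connect="jdbc:oracle:thin:@//dbhost:1521/ORCL",  # placeholder host
    username="etl_user",
    table="CUSTOMERS",
    target_dir="/data/raw/customers",
    split_by="CUSTOMER_ID",
)
```

The imported text files would then be loaded into ORC-backed Hive tables in a separate HQL step, as the project description notes.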
Big Data Engineer, Accenture
Feb 2019 - Oct 2021 | Dublin 2, IE
The project supported the bank's Risk Management team, which needed to store, process, and manage the huge volume of data collected from various sources in day-to-day operations. The system primarily checks the credibility of the customer and looks for credit risks.
Roles and responsibilities:
• Ingested data from multiple sources such as MySQL.
• Created and worked on Sqoop jobs with incremental load.
• Designed both managed and external tables in Hive.
• Developed Spark code in Scala using Spark SQL and DataFrames for optimization.
• Created an HBase layer for faster reporting.
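The incremental-load pattern used in the Sqoop jobs above (Sqoop's `--incremental append` with `--check-column` and `--last-value`) boils down to watermark bookkeeping, shown here as a hedged pure-Python sketch with hypothetical column names:

```python
def incremental_load(rows, check_column, last_value):
    """Mimic Sqoop incremental append: keep only rows whose check
    column exceeds the stored watermark, and return the new watermark
    to persist for the next scheduled run."""
    new_rows = [r for r in rows if r[check_column] > last_value]
    new_last = max((r[check_column] for r in new_rows), default=last_value)
    return new_rows, new_last

# First run imports everything after watermark 0; the returned
# watermark (the max id seen) is saved and fed to the next run.
batch, watermark = incremental_load(
    [{"id": 1}, {"id": 2}, {"id": 3}], check_column="id", last_value=0)
```

Persisting the watermark between runs is what lets each job pull only the delta instead of re-importing the full table.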
Big Data Engineer, D-Vois Communications Private Limited (Formerly D-Vois Broadband Pvt. Ltd.)
Nov 2017 - Jan 2019 | Bangalore, Karnataka, IN
Responsibilities:
• Analyzed data using the Hadoop components Hive and Pig, along with HBase queries.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
• Involved in loading data from the UNIX file system to HDFS.
• Responsible for creating Hive tables, loading data, and writing Hive queries.
• Handled importing data from various data sources, performed transformations using Hive and MapReduce/Apache Spark, and loaded data into HDFS.
• Extracted data from an Oracle database into HDFS using Sqoop.
• Exported the analyzed patterns back to Teradata using Sqoop.
• Loaded data from web servers and Teradata using Sqoop and the Spark Streaming API.
• Utilized the Spark Streaming API to stream data from various sources; optimized existing Scala code and improved cluster performance.
• Experienced in tuning Spark applications (batch interval, level of parallelism, memory) to improve processing time and efficiency.
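The batch-interval tuning mentioned above refers to how Spark Streaming slices an incoming stream into fixed-width micro-batches; a larger interval means fewer, bigger batches. A minimal pure-Python sketch of that windowing (timestamps and interval are illustrative, not from the project):

```python
def micro_batches(events, batch_interval):
    """Bucket (timestamp, payload) events into fixed-width windows,
    the way Spark Streaming groups records by batch interval.
    Window key = start time of the interval the event falls into."""
    batches = {}
    for ts, payload in events:
        window_start = (ts // batch_interval) * batch_interval
        batches.setdefault(window_start, []).append(payload)
    return batches
```

Tuning the interval trades latency (shorter windows deliver results sooner) against per-batch scheduling overhead (longer windows amortize it).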
Education
Siddhartha Institute Of Engineering & Technology, Electronics and Communications Engineering
Govt. Polytechnic College, Electronics and Communications Engineering
Frequently Asked Questions about Ajay Kadiyala
What company does Ajay Kadiyala work for?
Ajay Kadiyala works for Topmate.io
What is Ajay Kadiyala's role at the current company?
Ajay Kadiyala's current role is Mentor - Data Engineering.
What schools did Ajay Kadiyala attend?
Ajay Kadiyala attended Siddhartha Institute Of Engineering & Technology and Govt. Polytechnic College.