Cloud Data Engineer & Architect with over 10 years of experience, specializing in building and optimizing data platforms using Azure, Databricks, Snowflake, Kafka, and dbt. Skilled in designing end-to-end data pipelines, processing petabyte-scale datasets, and managing real-time streaming and batch data. Adept at ensuring data governance and regulatory compliance using Unity Catalog.

Azure:
• Azure Data Engineering & Architecture: Proficient in Azure Data Factory, Azure Synapse Analytics, Azure Data Lake, Azure SQL, Azure Event Hub, and Azure Functions for scalable, secure, and performant data solutions. Expertise in ETL/ELT processes and real-time data streaming.
• Azure Event Hub & Streaming Data: Skilled in implementing real-time streaming data pipelines with Azure Event Hub, ensuring low-latency, high-throughput data processing.
• Azure Synapse & Data Warehousing: Expertise in using Azure Synapse Analytics for large-scale data warehousing, processing, and integration with the broader Azure ecosystem.

Databricks:
• Data Engineering & Apache Spark: Deep expertise in Databricks and PySpark for big data processing, batch and streaming optimizations, and data transformation workflows. Experienced with Delta Lake for data integrity, upserts, and deduplication of large datasets.
• Unity Catalog & Auto Loader: Proficient in using Unity Catalog for data governance and Databricks Auto Loader for efficient ingestion of real-time data streams (a minimal sketch follows this summary).
• Performance Optimization: Proven experience in optimizing Apache Spark workflows, improving system reliability, and enhancing throughput using Spark SQL and the DataFrame API for dynamic transformations.
• CI/CD & DevOps Integration: Expertise in setting up CI/CD pipelines using Azure DevOps, ensuring robust deployment of Databricks workflows, and collaborating with cross-functional teams to streamline data engineering processes.

Snowflake:
• Snowflake Data Warehousing: Working knowledge of Snowflake architecture, configuration, and optimization for high-performance data warehousing. Hands-on experience with Snowpipe, Streams, Materialized Views, and role-based access control (RBAC) for managing large datasets.
• Performance Tuning & Query Optimization: Working knowledge of optimizing Snowflake queries, tuning performance, and managing resources for scalable and efficient data processing.
• Data Integration & ETL: Skilled in building and maintaining ETL pipelines, ensuring seamless integration between Snowflake and other cloud platforms.
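To make the Databricks bullets concrete, here is a minimal PySpark sketch of the Auto Loader ingestion plus Delta Lake upsert/deduplication pattern referenced above. The paths, table location, and key column are hypothetical placeholders, and the sketch assumes the target Delta table already exists.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Hypothetical locations and key column, for illustration only.
SOURCE_PATH = "/mnt/raw/events"        # landing zone watched by Auto Loader
TARGET_PATH = "/mnt/curated/events"    # existing Delta table
KEY_COL = "event_id"

def upsert_batch(batch_df, batch_id):
    """Merge one micro-batch into the Delta table, deduplicating on the key."""
    deduped = batch_df.dropDuplicates([KEY_COL])
    target = DeltaTable.forPath(spark, TARGET_PATH)
    (target.alias("t")
           .merge(deduped.alias("s"), f"t.{KEY_COL} = s.{KEY_COL}")
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())

# Auto Loader (cloudFiles) incrementally discovers new files as they land.
(spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/mnt/schemas/events")
      .load(SOURCE_PATH)
      .writeStream
      .foreachBatch(upsert_batch)
      .option("checkpointLocation", "/mnt/checkpoints/events")
      .start())
```

Because Auto Loader records processed files in the checkpoint and the merge keys on event_id, restarts and retries do not introduce duplicate rows in the target table.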
Senior Consultant 3 | Neudesic, An IBM Company | Jun 2023 - Present | Bengaluru, Karnataka, India
Lead Data Consultant | Xebia | May 2022 - Jun 2023 | Bengaluru, Karnataka, India
Senior Data Engineer | Fractal | Jul 2021 - Dec 2021 | Bangalore Urban, Karnataka, India
Senior Data Engineer | Enquero | Jul 2019 - Jul 2021 | Bangaon Area, India
Senior Consultant | Polaris Consulting & Services Ltd | Feb 2018 - Mar 2019 | Hyderabad Area, India
● In the preprocessing phase, used Spark for data cleansing and transformation to improve data accuracy.
● Designed and built the Reporting Application, which uses Spark SQL to fetch Hive table data and generate reports.
● Loaded customer profiles, customer information, account information, etc. into HDFS using Spark.
● Developed a robust data pipeline to cleanse, filter, aggregate, and normalize data using Apache Spark.
● Worked with different data sources, such as XML files, JSON files, and Oracle, to load data into Hive tables.
● Imported data from DB2 tables into HDFS and the Hive data warehouse using Sqoop.
● Implemented schema extraction for Parquet and Avro file formats in Hive.
● Created Hive external tables for querying the data.
● Implemented partitions, dynamic partitions, and bucketing to retrieve results faster (see the sketch after this list).
● Involved in requirements gathering, design, and development.
● Extracted, transformed, and loaded data directly from different source systems, such as flat files, into HDFS.
● Stored data in ORC format with Snappy compression.
● Imported data from Oracle and SQL tables into HDFS and Hive using Sqoop.
● Ingested data from Oracle and MySQL into HDFS using Sqoop.
● Processed complex nested JSON data using the DataFrame API (POC), also shown in the sketch below.
● Compressed files before loading them into Hive.
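As an illustration of the nested-JSON processing and the partitioned, bucketed, Snappy-compressed ORC loads described above, here is a minimal PySpark sketch; the input path, column names, and table name are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode

spark = (SparkSession.builder
         .appName("customer-etl")      # hypothetical job name
         .enableHiveSupport()
         .getOrCreate())

# Read complex nested JSON and flatten the array-of-struct field.
raw = spark.read.json("/data/raw/customers.json")   # hypothetical path
flat = (raw
        .select("customer_id", "region", explode("accounts").alias("acct"))
        .select("customer_id", "region",
                col("acct.account_id").alias("account_id"),
                col("acct.balance").alias("balance")))

# Write as Snappy-compressed ORC, partitioned by region and bucketed by
# customer_id so lookups and joins on the key touch fewer files.
(flat.write
     .format("orc")
     .option("compression", "snappy")
     .partitionBy("region")
     .bucketBy(16, "customer_id")
     .sortBy("customer_id")
     .mode("append")
     .saveAsTable("analytics.customer_accounts"))   # hypothetical table
```

Partitioning prunes directories at query time, while bucketing clusters rows by key within each partition, which is how such loads speed up downstream queries.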
Arun J
Education:
Teegala Krishna Reddy Engineering College, Computer Science