As a Data Engineer with 3+ years of experience, I specialize in building and optimizing scalable Big Data pipelines using AWS cloud services, Spark, Python, and Scala. My expertise spans developing efficient ETL processes, designing distributed systems, and tuning Spark applications for performance.

My hands-on experience includes big data tools such as HDFS, Hive, and Kafka, along with AWS services including EMR, S3, Redshift, and Athena. I have a solid track record of delivering end-to-end data solutions, whether that means automating AWS infrastructure, writing custom UDFs, or collaborating with data science teams to productionize machine learning models.

I thrive on troubleshooting complex data pipelines, improving performance through Spark tuning techniques (such as repartitioning, broadcast variables, and caching), and optimizing Hive queries. I’m also proficient in working with structured data, advanced SQL, and file formats such as CSV, Parquet, ORC, and Avro.

Key Skills & Technologies:
Big Data Tools: Spark, HDFS, Hive, Kafka
Programming: Python, Java, Scala
Cloud Services: AWS EMR, S3, Redshift, Athena
ETL & Data Pipelines: Data ingestion, cleansing, transformation, and aggregation
SQL & HiveQL: Complex queries, UDFs, performance optimization
Project Management & CI/CD: Jira, Git, Jenkins

I am passionate about using modern data technologies to drive business insights and solve complex problems in a scalable, efficient way. Feel free to connect with me if you’d like to discuss data solutions or potential collaborations.