Experienced Sr. Data Engineer with a strong background in building scalable distributed data solutions and applying a range of technologies and frameworks to data processing and analytics.

At Deutsche Bank, I have used Spark RDDs for data processing and analysis, queried Hive tables with HiveContext and HiveQL, and applied efficient data modeling and partitioning in Hive to improve performance. I have also developed custom RDDs in Scala, applied caching strategies within Spark scripts, designed Hive fact tables for data processing, and worked with Azure services such as Azure SQL Database, Azure Data Factory, and Azure SQL Data Warehouse.

During my time at Nike, I built scalable data solutions in an EMR cluster environment, used Kafka and Sqoop to collect and load data into Hadoop, and extracted real-time data feeds with Kafka and Spark Streaming. I also used the Spark API and Hive for data analytics, performed text analytics using Spark's in-memory computing capabilities, and migrated MapReduce programs to Spark transformations. Additionally, I designed and implemented data warehouses and used tools such as Tableau, Apache Solr, and Oozie for data analysis and reporting.

Throughout my career, I have focused on data security, implementing encryption techniques and authentication methods such as Kerberos, LDAP, and Active Directory. I have also tuned the performance of Spark jobs, worked with compression techniques and file formats such as Avro, Parquet, and ORC, and integrated Spark with technologies such as Apache Kafka, Cassandra, and Solr.

My technical skills include Cloudera Hadoop, Spark, Scala, Hive, Sqoop, Kafka, Databricks, SQL, PySpark, and Java, along with Jenkins and Docker for CI/CD pipelines. I am well-versed in agile methodologies and have experience working in cross-functional teams.