• Big Data Engineer with 9+ years of experience in designing, implementing, and optimizing large-scale data solutions across various industries.
• Proficient in maintaining data quality through cleaning, transformation, and ensuring data integrity in relational environments.
• Conducted comprehensive architecture assessments and implementations of AWS services, leveraging Hadoop ecosystems for big data processing.
• Developed robust security frameworks for fine-grained access control to cloud storage objects, ensuring compliance with data privacy regulations.
• Expertise in architecting and deploying Enterprise Data Lakes to support diverse use cases, including data integration, analytics, and reporting.
• Implemented Kerberos authentication for secure network communication in Hadoop clusters and tested various ecosystem components.
• Applied machine learning algorithms to analyze and predict outcomes from large datasets, utilizing cloud services for real-time data ingestion and storage.
• Expertise in using Apache Spark and AWS EMR for large-scale data transformation and movement between various data stores and databases.
• Created Lambda functions to optimize cloud resource usage and reduce costs across multiple regions.
• Developed reusable frameworks for automating ETL processes from relational databases to Data Lakes, using Spark and Hive.
• Skilled in data blending and preparation using Alteryx and SQL for visualization tools like Tableau.
• Integrated Apache Airflow with cloud services to orchestrate and monitor multi-stage machine learning workflows.
• Implemented real-time data processing systems using cloud-native services like Google Cloud Pub/Sub and Dataflow for stream processing and analysis.
• Designed and deployed scalable, fault-tolerant solutions using container orchestration technologies like Kubernetes.
• Implemented encryption mechanisms and ensured compliance with industry standards for handling sensitive data.
• Configured monitoring and alerting systems for real-time oversight of system health, performance metrics, and critical events.
• Led cross-functional teams in designing and implementing complex data solutions, conducting code reviews and sprint planning sessions.
• Developed custom PySpark scripts and AWS Glue ETL jobs to automate data transformation and loading processes.