Professional Summary:
Experienced Principal Data Engineer/Architect with 12+ years of expertise in designing and implementing scalable data architectures for high-volume environments across the finance, e-commerce, media, advertising, and healthcare industries. Proficient in building complex ETL pipelines and data processing systems using tools like Apache Spark, Kafka, Airflow, and AWS Glue. Extensive hands-on experience with cloud platforms including AWS, Google Cloud, and Azure, leveraging services such as Redshift, S3, BigQuery, and Azure Data Lake. Specialized in real-time data streaming using Kafka and Kinesis and in optimizing data warehouses with Snowflake, Redshift, and BigQuery. Adept at working with big data technologies such as Hadoop, HDFS, and Hive to process large-scale datasets. Proven ability to integrate and manage NoSQL databases like MongoDB, Cassandra, and DynamoDB, and relational databases such as MySQL and PostgreSQL. Strong background in data governance, compliance (GDPR, CCPA), and data security, with expertise in encryption and IAM practices.
Key capabilities include:
Architecting large-scale cloud-native data infrastructures across AWS, GCP, and Azure for both batch and real-time data processing.
Designing end-to-end ETL workflows with Apache Airflow, Talend, and AWS Glue to automate data ingestion and processing pipelines.
Creating data lake architectures using AWS S3, Azure Data Lake, and Google Cloud Storage for scalable, cost-effective storage solutions.
Developing high-performance data warehousing solutions with Snowflake, Redshift, and BigQuery for improved query performance and reduced latency.
Implementing data streaming pipelines using Apache Kafka, Kinesis, and Spark Streaming for real-time data ingestion and analytics.
Expertise in data modeling, schema design, and performance tuning for relational (MySQL, PostgreSQL) and NoSQL (MongoDB, Cassandra) databases.
Advanced knowledge of CI/CD pipelines using Jenkins, Docker, and Kubernetes for automating deployment and monitoring of data infrastructure.
Implementing best practices in data governance and security (IAM, encryption), and ensuring compliance with GDPR and CCPA regulations.
Collaborating with data scientists, business intelligence teams, and stakeholders to deliver optimized data solutions for machine learning models and predictive analytics.
Experienced in designing metadata management and data cataloging solutions that improve data discoverability and governance.
Listed skills include Machine Learning, Data Science, Deep Learning, Artificial Intelligence, and 15 others.