Rishav Sarkar Email and Phone Number
As a seasoned Senior Data Engineer, I bring an extensive background in crafting sophisticated data solutions across AWS and GCP environments. My expertise spans Apache Spark, Scala, Python, Apache Airflow, Google BigQuery, Docker, Kubernetes, Apache Iceberg, and Apache Kafka. I've spearheaded the design and execution of intricate data architectures, employing Apache Spark and Scala, engineering real-time processing systems with Apache Kafka, and establishing robust data ingestion frameworks. I also leverage strong Python-based data visualization skills, creating insightful representations that meaningfully inform decision-making.
Meghgen Technologies Private Limited
- Website: meghgen.com
- Employees: 49
Lead Data Engineer | Meghgen Technologies Private Limited | Bengaluru, Karnataka, India
Senior Data Engineer | Meghgen Technologies Private Limited | Apr 2023 - Present | Bengaluru, Karnataka, India
1. Designed and developed a sophisticated data integration platform using Snowflake and Apache Kafka for streaming data ingestion, coupled with dbt for data transformations, achieving near real-time analytics capabilities.
2. Led the end-to-end design and implementation of a multi-terabyte enterprise data warehouse in Snowflake, incorporating advanced partitioning and clustering techniques to optimize query performance for over 100 concurrent users.
3. Automated end-to-end data pipeline monitoring and alerting using a combination of Snowflake's Information Schema, SnowAlert, and custom Python scripts, leading to a 70% reduction in pipeline downtime and faster incident resolution.
4. Established data validation checkpoints within ETL workflows using Snowflake's task and stream features, ensuring that only validated, error-free data progressed through the pipeline.
5. Managed over 100 TB of data within BigQuery, supporting analytical queries for 50+ concurrent users and enhancing decision-making processes.
6. Designed, developed, and deployed a specialized data pipeline handling 2 TB/day from 50+ sources, achieving a 40% reduction in processing time. Leveraged Iceberg tables for data analytics, enhancing read performance tenfold, streamlining data processing, and accelerating insights retrieval.
7. Designed, developed, and deployed real-time replication of database changes to Apache Iceberg tables, efficiently handling 500+ events per second without the need for Spark, Kafka, or a dedicated streaming platform.
8. Engineered and orchestrated a robust production-level project leveraging Spark, Kafka, and NoSQL to process 100K+ records per second. Developed Python programs for real-time message handling, email registration, and authentication, ensuring secure multi-producer and multi-consumer interactions.
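The validation-checkpoint pattern in item 4 can be sketched in plain Python. This is an illustrative approximation only (the production version relied on Snowflake streams and tasks, not application code), and the field names are hypothetical:

```python
def validate_records(records, required_fields):
    """Partition records into (valid, rejected), mirroring an ETL
    validation checkpoint: only error-free rows progress downstream."""
    valid, rejected = [], []
    for rec in records:
        # A row is rejected if any required field is absent or empty.
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        if missing:
            rejected.append({"record": rec, "missing": missing})
        else:
            valid.append(rec)
    return valid, rejected
```

In the Snowflake variant, a stream captures newly arrived rows and a scheduled task applies equivalent checks before merging into the target table.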
Software Engineer 2 | Dell | Mar 2020 - Apr 2023 | Bengaluru, Karnataka, India
1. Designed and developed a comprehensive catalog comparison tool in an AWS environment using Spark and Scala. Orchestrated the process with Airflow, enabling automated comparison among e-support, platform, and SDP catalogs, and identified discrepancies in software bundles (SWBs) within a dataset of 10+ million records, ensuring catalog consistency and data accuracy.
2. Engineered a comprehensive framework for the seamless ingestion of diverse database sources into AWS S3.
3. Processed and standardized data from 15+ different formats, ingesting 1.5 TB of data daily and supporting analytical queries efficiently.
4. Established a metadata management infrastructure using AWS Glue and Trino; cataloged ingested data, improving query response times by 30% and streamlining information retrieval.
5. Implemented optimization techniques in Spark jobs, fine-tuning through broadcasts, caching, and resource allocation adjustments. Rigorously tested and identified the optimal resources for each job, yielding roughly 15% cost savings on AWS.
6. Designed, developed, and deployed a robust data ingestion architecture using Cloud Pub/Sub, enabling seamless, scalable streaming intake at 1 TB/hour and real-time availability for downstream processes.
7. Developed and deployed Apache Kafka for real-time data processing, enabling seamless ingestion, processing, and analysis of 100 million records per batch cycle, optimizing system performance and enabling real-time decision-making.
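The broadcast tuning in item 5 can be illustrated without a cluster: Spark's broadcast join ships a small table as an in-memory map to every executor so the large side joins without a shuffle. A minimal plain-Python sketch of that map-side hash join (table contents here are hypothetical, not from the actual catalogs):

```python
def broadcast_hash_join(large_rows, small_rows, key):
    """Map-side hash join: the small table becomes an in-memory lookup
    (what Spark's broadcast does on each executor) and the large side
    streams past it, so the large table is never shuffled."""
    lookup = {row[key]: row for row in small_rows}
    joined = []
    for row in large_rows:
        match = lookup.get(row[key])
        if match is not None:
            # Merge matching rows; large-side values win on collisions.
            joined.append({**match, **row})
    return joined
```

The cost saving comes from avoiding the shuffle of the large side; Spark applies this automatically below `spark.sql.autoBroadcastJoinThreshold`, or explicitly via `broadcast()`.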
Programmer Analyst | Cognizant | Sep 2018 - Feb 2020 | Bengaluru Area, India
1. Developed and deployed a data architecture enabling seamless ingestion of 1 TB/day from 10+ diverse sources into AWS S3. Utilized Spark RDDs for complex transformations across datasets, processing over 100 million records daily.
2. Employed Python's Plotly and Matplotlib libraries to craft comprehensive data visualizations and generate insightful reports.
3. Leveraged AWS Glue to efficiently inspect and analyze the ingested data, enhancing visibility and accessibility for further insights.
4. Created Python-based data validation tools, ensuring data accuracy and compliance and delivering a 90% improvement in data quality.
5. Migrated Spark RDD-based code to the Spark DataFrame API to optimize functionality and take advantage of its richer capabilities.
6. Orchestrated 15+ data pipelines with Apache Airflow, reliably managing diverse data sources and destinations.
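Orchestrating pipelines with Airflow (item 6) amounts to running tasks in dependency order over a DAG. A sketch of that dependency resolution using Python's standard-library graphlib; the task names are hypothetical placeholders, not the actual pipelines:

```python
from graphlib import TopologicalSorter

def pipeline_order(dependencies):
    """Return one valid execution order for pipeline tasks.

    `dependencies` maps each task to the set of tasks it depends on --
    the same information an Airflow DAG encodes with `upstream >> downstream`.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

Airflow's scheduler performs this resolution continuously, also handling retries, schedules, and parallelism limits that this sketch omits.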
Internship Trainee | OpenText | Jan 2018 - Mar 2018 | Bengaluru Area, India
1. Used Java as the primary programming language, Windows PowerShell for scripting, Git for version control, and Jenkins as the automation tool to demonstrate how to automate the manual parts of the configuration, integration, and deployment processes of D2 (an advanced, intuitive, and configurable content-centric client for Documentum).
2. Developed a Python automation script to automate Git operations such as cloning and updating repositories.
3. Supported teams by demonstrating best practices for Git and GitHub, including branching techniques for optimal code management.
4. Assisted teams and conducted demonstrations on leveraging Jenkins for implementing Continuous Integration/Continuous Delivery (CI/CD) within projects.
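The clone/update automation in item 2 can be sketched with the standard library; the repository URL and destination path below are placeholders, and the split into a pure command builder and a runner is an illustrative choice, not the original script:

```python
import subprocess
from pathlib import Path

def git_sync_command(url, dest):
    """Choose between an initial clone and a fast-forward update,
    depending on whether a checkout already exists at `dest`."""
    if (Path(dest) / ".git").is_dir():
        return ["git", "-C", str(dest), "pull", "--ff-only"]
    return ["git", "clone", url, str(dest)]

def sync_repo(url, dest):
    # Run the chosen git command; raises CalledProcessError on failure.
    subprocess.run(git_sync_command(url, dest), check=True)
```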
Internship Trainee | Bengal Chemicals & Pharmaceuticals Ltd. | Jul 2017 - Aug 2017 | Kolkata Area, India
During my internship at Bengal Chemicals & Pharmaceuticals Ltd., I worked on the implementation of an Enterprise Resource Planning (ERP) system, focusing on the purchase, production, and sales modules.
Rishav Sarkar Education Details
Narula Institute of Technology | Information Technology
St. Jude's High School | Science
Frequently Asked Questions about Rishav Sarkar
What company does Rishav Sarkar work for?
Rishav Sarkar works for Meghgen Technologies Private Limited
What is Rishav Sarkar's role at the current company?
Rishav Sarkar's current role is Lead Data Engineer.
What schools did Rishav Sarkar attend?
Rishav Sarkar attended Narula Institute of Technology and St. Jude's High School.