Michael Augustin Amalraj Email and Phone Number

Lead Data Engineer at EPAM Systems | 3 x AWS Certified | CKAD | Hadoop, Spark, Kafka, Hive, ELK Stack

Michael Augustin Amalraj's Location

India

Michael Augustin Amalraj's Current Company Details

Lead Data Engineer at EPAM Systems | 3 x AWS Certified | CKAD | Hadoop, Spark, Kafka, Hive, ELK Stack

Michael Augustin Amalraj Work Experience Details

Lead Data Engineer

EPAM Systems Jul 2021 - May 2024

Singapore

• Managed a team of 3 data engineers on a banking regulatory project, overseeing requirement gathering, architecture design, and big data application development.
• Solely developed a distributed ETL framework in PySpark that handles thousands of intricate business rules configured in MariaDB, reducing development time by 70%.
• Solely developed a custom scheduler application in Python to run multiple data pipelines in parallel across 11 sectors while maintaining the dependencies between sectors.
• Built a data validation tool with user-configured checks, guaranteeing data quality at a 100% accuracy rate.
• Independently developed a data compaction utility in Spark to tackle the small file issue, optimizing big data cluster performance and improving system performance by 40%.
• Diagnosed the root causes and bottlenecks of slow-running existing big data pipelines, then optimized them by rewriting inefficient code and fine-tuning configurations, reducing end-to-end pipeline duration by 60%.
• Contributed to the development of a data migration pipeline transferring data from an in-house warehouse (ADA platform) to AWS Redshift for enhanced data analytics and visualization.
• Provisioned AWS services including EC2, SQS, S3, Lambda, API Gateway, and Rekognition for a marketing campaign project on the AWS cloud.

Senior Data Engineer

Marina Bay Sands Oct 2019 - Jul 2021

Singapore

• Built specialized Spark applications to clean and analyze historical data and generate strategic recommendations for Casino operations. This effort directly contributed to a 20% increase in revenue.
• Designed, implemented, and tested data flow and integration pipelines in Apache Hadoop using Kafka, Flume, and Spark Streaming.
• Constructed ETL pipelines using Spark, Hadoop, Hive, Impala, and Kudu to load data into the data warehouse. These pipelines supplied data to machine learning jobs used in credit risk calculation.
• Streamlined diverse data pipelines through shell script automation.
• Tuned the performance of big data applications and pipelines to maintain throughput and scalability as business demands evolved.

Data Engineer

Crédit Agricole CIB Sep 2018 - Oct 2019

Singapore

• Designed and implemented Spark applications for calculating Market Risk according to FRTB standards, incorporating standard approaches such as DRC, RRAO, and SBM.
• Boosted data application performance through optimization, achieving a 40% improvement in efficiency.
• Implemented ELK stack pipelines to parse, transform, and load Spark application logs so that BAs and the support team could create reports and monitor the process.
• Identified the root cause of intermittent failures that had persisted for two years in a critical Spark application and implemented the necessary fixes, improving the application's reliability and performance in production.
• Single-handedly designed and deployed a custom orchestrator application, cutting operational costs by 60% and removing the reliance on Control-M.
• Upgraded the code base from Spark 1.6 to version 2.3, improving performance and lowering operational costs by 20%.

Big Data Developer

Bigdatalabs Oct 2016 - Jul 2018

Coimbatore Area, India

• Developed Spark applications using Core Spark, Spark SQL, Spark DataFrame, Spark Streaming, and Spark Structured Streaming.
• Performed data import and export using Sqoop.
• Created Hive tables (managed, external, partitioned, and bucketed) with different file formats such as Avro, Parquet, JSON, and Sequence.
• Handled large volumes of streaming data using Kafka and Spark.
• Hands-on experience with the ELK stack (Elasticsearch, Logstash, Kibana).
• Installed a multi-node Hadoop cluster (Hadoop, Spark, Kafka, Elasticsearch, Kibana) on Google Cloud Platform.
• Experience in Spark, Hive, and Kafka performance tuning.
• Hands-on experience with NoSQL databases such as MongoDB, Cassandra, and Neo4j.
• Experience in writing Unix shell scripts.
• Hands-on experience with Google Cloud Platform and its services such as DataFlow and BigQuery.
• Hands-on experience with AWS (EC2, S3, etc.).

Programmer

Ugam May 2014 - Sep 2016

Coimbatore Area, India

• Developed questionnaire and quota control logic within the IBM SPSS Data Collection tool to create survey links and capture survey results for market research purposes.
• Created test data for quality assurance assessments of online surveys and addressed queries from the QA team.
• Enhanced the interactivity of online surveys and tailored the user interface to specific business requirements.
• Engaged in client calls to clarify queries and ensure alignment with project objectives.
• Planned and prioritized tasks to meet project timelines and deliverables.

Michael Augustin Amalraj Education Details

Park College Of Engineering And Technology

7.6

Frequently Asked Questions about Michael Augustin Amalraj

What is Michael Augustin Amalraj's role at the current company?

Michael Augustin Amalraj's current role is Lead Data Engineer at EPAM Systems | 3 x AWS Certified | CKAD | Hadoop, Spark, Kafka, Hive, ELK Stack.

What schools did Michael Augustin Amalraj attend?

Michael Augustin Amalraj attended Park College Of Engineering And Technology.
