-
Lead Data Engineer
EPAM Systems | Jul 2021 - May 2024 | Singapore
• Managed a team of 3 data engineers in a banking regulatory project, overseeing requirement gathering, architecture design, and big data application development.
• Solely developed a robust, distributed ETL framework using PySpark, handling thousands of intricate business rules configured in MariaDB, resulting in a 70% reduction in development time.
• Solely developed a custom scheduler application in Python to streamline the parallel execution of multiple data pipelines across 11 sectors while maintaining the dependencies between sectors.
• Built a data validation tool allowing for user-configured checks, guaranteeing data quality at a 100% accuracy rate.
• Independently developed a data compaction utility in Spark to tackle the small file issue, optimizing big data cluster performance and yielding a 40% enhancement in system performance.
• Diagnosed the root cause and bottleneck of slow-running existing big data pipelines, then optimized them by rewriting inefficient code and fine-tuning configurations. This initiative led to a 60% reduction in the end-to-end data pipeline duration.
• Contributed to the development of a data migration pipeline, transferring data from an in-house warehouse (ADA platform) to AWS Redshift for enhanced data analytics and visualization. Additionally, provisioned various AWS services including EC2, SQS, S3, Lambda, API Gateway, and Rekognition for a marketing campaign project on the AWS cloud.
-
Senior Data Engineer
Marina Bay Sands | Oct 2019 - Jul 2021 | Singapore
• Crafted specialized Spark applications to meticulously clean, analyze, and generate strategic recommendations for Casino operations, leveraging historical data. This pivotal effort directly contributed to a noteworthy 20% increase in revenue.
• Designed, implemented, and tested data flow and integration pipelines in Apache Hadoop using Kafka, Flume and Spark Streaming.
• Constructed robust ETL pipelines utilizing Spark, Hadoop, Hive, Impala, and Kudu to seamlessly load data into the data warehouse. These pipelines were instrumental in providing data to fuel machine learning jobs, facilitating credit risk calculation processes.
• Streamlined diverse data pipelines through the implementation of shell script automation.
• Fine-tuned and optimized the performance of big data applications and pipelines, ensuring optimal throughput and scalability to meet evolving business demands.
-
Data Engineer
Crédit Agricole CIB | Sep 2018 - Oct 2019 | Singapore
• Designed and implemented Spark applications for calculating Market Risk according to FRTB standards, incorporating standard approaches such as DRC, RRAO, and SBM.
• Boosted data application performance through optimization, achieving a 40% enhancement in efficiency.
• Implemented Elasticsearch (ELK) stack pipelines to parse, transform, and load Spark application logs so that BAs and the support team could create reports and monitor the process.
• Conducted thorough analysis to identify the root cause of intermittent failures that had persisted for two years in a critical Spark application. Implemented the necessary fixes, improving application reliability and performance in the production environment.
• Single-handedly designed and deployed a custom orchestrator application, cutting operational costs by 60% and removing the reliance on Control-M.
• Upgraded the code base from Spark 1.6 to version 2.3, yielding improved performance and lowering operational costs by 20%.
-
Big Data Developer
Bigdatalabs | Oct 2016 - Jul 2018 | Coimbatore Area, India
• Developed Spark applications using Core Spark, Spark SQL, Spark DataFrame, Spark Streaming and Spark Structured Streaming.
• Performed data import and export using Sqoop.
• Created Hive tables (managed, external, partitioned and bucketed) with different file formats such as Avro, Parquet, JSON and Sequence.
• Handled large volumes of streaming data using Kafka and Spark.
• Hands-on experience with the ELK stack (Elasticsearch, Logstash, Kibana).
• Installed a multi-node Hadoop cluster (Hadoop, Spark, Kafka, Elasticsearch, Kibana) on Google Cloud Platform.
• Experience in Spark, Hive, and Kafka performance tuning.
• Hands-on experience with NoSQL databases such as MongoDB, Cassandra and Neo4j.
• Experience in writing Unix shell scripts.
• Hands-on experience with Google Cloud Platform and its services such as DataFlow and BigQuery.
• Hands-on experience with AWS (EC2, S3, etc.).
-
Programmer
Ugam | May 2014 - Sep 2016 | Coimbatore Area, India
• Developed questionnaire and quota control logic within the IBM SPSS Data Collection tool to facilitate the creation of survey links and capture survey results for market research purposes.
• Facilitated the creation of test data for quality assurance assessments on online surveys and addressed queries from the QA team.
• Enhanced the interactivity of online surveys and tailored the user interface to align with specific business requirements.
• Engaged in client calls to clarify queries and ensure alignment with project objectives.
• Strategized and prioritized tasks to meet project timelines and deliverables effectively.
Michael Augustin Amalraj Education Details
-
Park College Of Engineering And Technology | 7.6
Frequently Asked Questions about Michael Augustin Amalraj
What is Michael Augustin Amalraj's role at the current company?
Michael Augustin Amalraj's current role is Lead Data Engineer at EPAM Systems | 3 x AWS Certified | CKAD | Hadoop, Spark, Kafka, Hive, ELK Stack.
What schools did Michael Augustin Amalraj attend?
Michael Augustin Amalraj attended Park College Of Engineering And Technology.