Md Hassan

Md Hassan Email and Phone Number

Senior Data Engineer @ Capital One
Lansdale, PA, US
Md Hassan's Location
Lansdale, Pennsylvania, United States
About Md Hassan

Over 10 years of diversified experience in software design and development, including experience as a Big Data Engineer solving business use cases for several clients, with particular expertise in backend applications.

Md Hassan's Current Company Details
Capital One

Senior Data Engineer
Lansdale, PA, US
Website:
capitalone.com
Employees:
63,917
Md Hassan Work Experience Details
  • Capital One
    Senior Data Engineer
    Capital One
    Lansdale, PA, US
  • Capital One
    Big Data Engineer
    Capital One Oct 2022 - Present
    McLean, VA, US
    • Involved in designing and deploying multi-tier applications using AWS services (EC2, Route53, S3, RDS, DynamoDB, SNS, SQS, IAM), focusing on high availability, fault tolerance, and auto-scaling via AWS CloudFormation.
    • Supported continuous storage in AWS using Elastic Block Storage, S3, and Glacier; created volumes and configured snapshots for EC2 instances.
    • Used the DataFrame API in Scala to convert distributed collections of data organized into named columns, developing predictive analytics with the Apache Spark Scala APIs.
    • Developed Scala scripts using both DataFrames/SQL/Datasets and RDD/MapReduce in Spark for data aggregation and queries, writing data back into the OLTP system through Sqoop.
    • Developed Hive queries to pre-process the data required for running the business process.
    • Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
    • Implemented generalized solution models using AWS SageMaker.
    • Extensive expertise using the core Spark APIs and processing data on an EMR cluster.
    • Worked on ETL migration services by developing and deploying AWS Lambda functions, generating a serverless data pipeline that writes to the Glue catalog and can be queried from Athena.
    • Programmed in Hive, Spark SQL, Java, C#, and Python to streamline incoming data, build data pipelines for useful insights, and orchestrate pipelines.
    • Worked on an ETL pipeline to source these tables and deliver the calculated ratio data from AWS to a Datamart (SQL Server) and the Credit Edge server.
    • Experience in using and tuning relational databases (e.g., Microsoft SQL Server, Oracle, MySQL) and columnar databases (e.g., Amazon Redshift, Microsoft SQL Data Warehouse).
  • LendingTree
    Senior Big Data Engineer
    LendingTree Feb 2021 - Oct 2022
    Charlotte, NC, US
    • Used Impala for data processing on top of Hive for better utilization.
    • Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data to DynamoDB using Scala.
    • Developed Spark code using Scala and Spark SQL for faster processing and testing.
    • Worked on Spark SQL to join multiple Hive tables, write them to a final Hive table, and store them on S3.
    • Created Spark jobs to do lightning-speed analytics over the Spark cluster.
    • Evaluated Spark's performance vs. Impala on transactional data.
    • Used Spark transformations and aggregations to compute min, max, and average on transactional data.
    • Extracted, transformed, and loaded data sources to generate CSV data files with Python programming and SQL queries.
    • Implemented Spark RDD transformations to map business analysis and applied actions on top of transformations.
    • Wrote various SQL and PL/SQL queries and stored procedures for data retrieval.
    • Designed ETL using internal/external tables, stored in Parquet format for efficiency.
    • Configured Spark Streaming to get ongoing information from Kafka and store the stream information to AWS.
    • Developed, deployed, and troubleshot ETL workflows using Hive, Pig, and Sqoop.
    • Optimized HiveQL/Pig scripts by using execution engines like Tez and Spark.
    • Developed end-to-end data processing pipelines that begin with receiving data via the Kafka distributed messaging system and persist it into Cassandra.
    • Experienced in migrating HiveQL into Impala to minimize query response time.
    • Collected data using Spark Streaming from an AWS S3 bucket in near-real-time, performing the necessary transformations and aggregations to build the data model and persisting the data in HDFS.
    • Fetched and generated monthly reports and visualized them using Tableau.
    • Used the Oozie workflow engine to run multiple Hive and Pig jobs.
    • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Merck Pharma
    Big Data Engineer
    Merck Pharma Nov 2019 - Jan 2021
    • Wrote Databricks code and ADF pipelines, fully parameterized for efficient code management.
    • Created and maintained SQL Server scheduled jobs, executing stored procedures to extract data from Oracle into SQL Server.
    • Extensively used Tableau for customer marketing data visualization.
    • Created Power BI reports and upgraded Power Pivot reports to Power BI.
    • Developed a detailed project plan and helped manage the data conversion migration from the legacy system to the target Snowflake database.
    • Transformed business problems into Big Data solutions and defined Big Data strategy and roadmap.
    • Installed, configured, and maintained data pipelines.
    • Developed Databricks Python notebooks to join, filter, pre-aggregate, and process files stored in Azure Data Lake Storage.
    • Utilized Power Query in Power BI to pivot and un-pivot the data model for data cleansing and data massaging.
  • AMD
    Big Data Engineer
    AMD Sep 2017 - Nov 2019
    Santa Clara, California, US
    • Implemented and managed ETL solutions and automated operational processes.
    • Optimized and tuned the Redshift environment, enabling queries to perform up to 100x faster for Tableau and SAS Visual Analytics.
    • Advanced knowledge of Confidential Redshift and MPP database concepts.
    • Migrated the on-premise database structure to the Confidential Redshift data warehouse.
    • Defined facts and dimensions and designed data marts using Ralph Kimball's dimensional data mart modeling methodology with Erwin.
    • Strong understanding of AWS components such as EC2 and S3.
    • Implemented a continuous delivery pipeline with Docker, GitHub, and AWS.
    • Built performant, scalable ETL processes to load, cleanse, and validate data.
    • Participated in the full software development lifecycle (requirements, solution design, development, QA implementation, and product support) using Scrum and other Agile methodologies.
    • Compiled data from various sources to perform complex analysis for actionable results.
    • Measured efficiency of the Hadoop/Hive environment, ensuring SLAs were met.
    • Worked on publishing interactive data visualization dashboards, reports, and workbooks on Tableau and SAS Visual Analytics.
    • Worked on Big Data on AWS cloud services, i.e., EC2, S3, EMR, and DynamoDB.
  • Apex Land Clearing & Development, LLC
    Data Engineer
    Apex Land Clearing & Development, LLC Aug 2015 - Aug 2017
    • Participated in requirements sessions to gather requirements along with business analysts and product owners.
    • Involved in Kafka and in building use cases relevant to our environment.
    • Worked on implementation and maintenance of a Cloudera Hadoop cluster.
    • Pulled data from the data lake (HDFS) and massaged it with various RDD transformations.
    • Involved in building an information pipeline and performed analysis utilizing the AWS stack (EMR, EC2, S3, RDS, Lambda, Glue, SQS, and Redshift).
    • Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
    • Developed Oozie workflow jobs to execute Hive, Sqoop, and MapReduce actions.
    • Architected, designed, and developed business applications and data marts for reporting.
    • Imported data from various sources like HDFS/HBase into Spark RDDs and developed a data pipeline using Kafka and Storm to store data in HDFS.
    • Collaborated with business users on requirement gathering for building Tableau reports per business needs.
    • Developed Pig Latin scripts to replace the existing legacy process with Hadoop, with the data fed to AWS S3.

Md Hassan Education Details

  • Westwood College
    Computer/Information Technology Administration and Management

Frequently Asked Questions about Md Hassan

What company does Md Hassan work for?

Md Hassan works for Capital One.

What is Md Hassan's role at the current company?

Md Hassan's current role is Senior Data Engineer.

What schools did Md Hassan attend?

Md Hassan attended Westwood College.

Who are Md Hassan's colleagues?

Md Hassan's colleagues are Tawanna Parker, Lei Qu, Ian Joseph, Alex Pickart, Brian Groom Sr., Vanesa Perez, Danny Jeon.
