Muhammad Q

Muhammad Q Email and Phone Number

Sr. Data Engineer at Capital One | Lead - AWS | Hadoop | Big Data | Snowflake | Talend | Ab Initio | Spark | PySpark | Azure | ADF | SQL | Python | Scala | Java | Tableau | Git | Jenkins | Kafka | MongoDB | MySQL
McLean, Virginia, United States
Muhammad Q's Location
McLean, Virginia, United States
About Muhammad Q

With over a decade of hands-on experience as a Senior Data Engineer, I specialize in architecting data-intensive applications that unlock actionable insights and drive business value. My expertise spans a broad spectrum of technologies, including the Hadoop ecosystem, big data analytics, Snowflake, Talend, Spark, PySpark, Ab Initio, and DataStage, among others.

Throughout my career, I've demonstrated a consistent ability to design and implement robust data solutions on AWS and Azure cloud platforms. I have a proven track record of developing and optimizing data pipelines, storage systems, and warehousing solutions, leveraging tools such as S3, EMR, Redshift, Glue, Azure Data Factory, and Synapse.

As a certified AWS Solutions Architect with a Master's degree in Computer Science, I bring a holistic understanding of cloud architecture and data engineering principles to every project. Proficient in data cleansing and preprocessing using Python, Alteryx, and Tableau, I ensure data accuracy and consistency at every stage of the data lifecycle. My expertise extends to various databases, including Cassandra, MongoDB, Oracle, MySQL, and SQL Server, enabling me to handle diverse data sources seamlessly.

I've led initiatives to optimize data infrastructure, enhance governance practices, and develop advanced analytics solutions to support strategic decision-making. Notably, at Capital One, I spearheaded the implementation of an Enterprise Data Lake, enabling diverse analytics and reporting use cases.

Passionate about staying abreast of technological advancements, I continuously expand my skill set to tackle evolving challenges in data engineering. I am eager to apply my expertise to drive impactful solutions and contribute to the success of forward-thinking organizations. Let's connect to explore opportunities for collaboration and innovation in the realm of data engineering!

Muhammad Q's Current Company Details
Capital One

Sr. Data Engineer at Capital One | Lead - AWS | Hadoop | Big Data | Snowflake | Talend | Ab Initio | Spark | PySpark | Azure | ADF | SQL | Python | Scala | Java | Tableau | Git | Jenkins | Kafka | MongoDB | MySQL
McLean, Virginia, United States
Website:
capitalone.com
Employees:
55043
Muhammad Q Work Experience Details
  • Capital One
    Sr. Data Engineer / Data Developer
    Capital One Jan 2023 - Present
    Virginia, United States
    Designed and implemented an Enterprise Data Lake for diverse use cases, including analytics, processing, storage, and reporting of large, dynamic datasets.
    Developed ETL solutions using Spark SQL in Databricks for data extraction, transformation, and aggregation from various sources and file formats (a sketch of this pattern follows below).
    Applied dimensional and relational data modeling concepts such as star schema, snowflake schema, and One Big Table.
    Used Athena to query Glue-processed data and generated reports through QuickSight for business intelligence.
    Supported the migration of data from SQL Server to Snowflake for better storage cost and performance.
    Used AWS EMR to transform and move large datasets into and out of other AWS data stores.
    Used Matillion to integrate data from multiple sources (files, SQL databases, NoSQL databases, and APIs), transforming the data and serving it to downstream BI applications.
    Integrated Terraform with CI/CD pipelines to automate infrastructure provisioning.
    Performed unit testing, integration testing, and web application testing using Pytest.
    Carried out AWS database migrations, converting existing Oracle and MS SQL Server databases to PostgreSQL, MySQL, and Aurora.
    Used dbt Cloud to manage and orchestrate dbt runs, configured the dbt Cloud project, and wrote dbt models to transform source data and load it into target tables.
    Developed and maintained data pipelines using Azure Data Factory and Azure Databricks.
    Worked hands-on with data analytics services such as Athena, Glue, Data Catalog, and QuickSight.
    Designed and built multi-terabyte data warehouse infrastructure on Redshift.
    Monitored the performance of Kafka clusters and scaled resources on AWS and GCP as needed to handle increasing data loads.
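    The Databricks ETL work above follows a common Spark SQL pattern. A minimal PySpark sketch, assuming hypothetical S3 paths, view names, and columns (none of these are taken from the profile):

        from pyspark.sql import SparkSession

        # On Databricks a SparkSession is provided as `spark`; building one
        # here keeps the sketch self-contained.
        spark = SparkSession.builder.appName("transactions-etl").getOrCreate()

        # Hypothetical raw-zone path; the role mentions multiple file formats.
        raw = spark.read.parquet("s3://example-bucket/raw/transactions/")
        raw.createOrReplaceTempView("raw_transactions")

        # Spark SQL extraction, transformation, and aggregation.
        daily_totals = spark.sql("""
            SELECT account_id,
                   CAST(txn_ts AS DATE) AS txn_date,
                   SUM(amount)          AS total_amount,
                   COUNT(*)             AS txn_count
            FROM raw_transactions
            WHERE amount IS NOT NULL
            GROUP BY account_id, CAST(txn_ts AS DATE)
        """)

        # Write the curated output back to the lake, partitioned for reporting.
        (daily_totals.write.mode("overwrite")
            .partitionBy("txn_date")
            .parquet("s3://example-bucket/curated/daily_totals/"))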
  • Citi
    Sr. Data Engineer / Developer
    Citi Feb 2021 - Nov 2022
    Tampa, Florida, United States
    Designed and implemented a data lake in AWS S3 using Amazon web services such as Kinesis, S3, Lambda, EMR, Athena, and QuickSight.
    Developed and maintained infrastructure as code in Terraform, building modules that provide infrastructure building blocks other teams can leverage.
    Used the AWS boto3 library for Python to create, configure, and manage AWS services.
    Built serverless APIs using AWS Lambda and Python (a sketch follows below).
    Developed data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target destinations.
    Submitted Spark jobs via PySpark that report metrics on the data, used for data quality checking.
    Used COPY/INSERT, PUT, and GET commands for bulk loading data into Snowflake tables from internal and external stages.
    Extracted and transformed data from Excel to support data migrations and mass changes in SAP.
    Loaded incremental data using Snowflake streams (CDC) and created tasks to move data into target tables on a scheduled interval.
    Used AWS Glue to create, run, and monitor ETL workflows that enrich, clean, normalize, combine, and replicate data across different data stores.
    Extracted data from multiple systems, conducted detailed data analysis, and loaded data into SAP MDM.
    Worked hands-on with Snowflake utilities such as SnowSQL and Snowpipe, and built and deployed data models using dbt and Git.
    Supported the migration of data from SQL Server to Snowflake with Ab Initio as the ETL tool.
    Created workflows using Ab Initio flows to move both real-time and batch data, and used Ab Initio Metadata Hub to manage and maintain metadata.
    Wrote transformation models in dbt, with Airflow (Python) orchestrating the end-to-end pipeline.
    Set up AWS SNS emails for daily job status notifications to stakeholders.
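    A minimal sketch of the serverless and notification pattern described above: a Python Lambda handler using boto3, with a hypothetical bucket, object key, and SNS topic ARN (the real resources are not named in the profile):

        import json
        import boto3

        s3 = boto3.client("s3")
        sns = boto3.client("sns")

        # Hypothetical topic backing the daily job status e-mails mentioned above.
        STATUS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:daily-job-status"

        def lambda_handler(event, context):
            """Serverless endpoint: read a small S3 object, report status via SNS."""
            bucket = event.get("bucket", "example-bucket")   # hypothetical defaults
            key = event.get("key", "daily/summary.json")

            obj = s3.get_object(Bucket=bucket, Key=key)
            payload = json.loads(obj["Body"].read())

            sns.publish(
                TopicArn=STATUS_TOPIC_ARN,
                Subject="Daily job status",
                Message=f"Loaded {key} with {len(payload)} records.",
            )
            return {"statusCode": 200, "body": json.dumps({"records": len(payload)})}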
  • Experian
    Data Engineer
    Experian Mar 2019 - Jan 2021
    Illinois, United States
    Developed and implemented scalable, efficient ETL data pipelines in AWS using services such as S3, Glue, Kinesis, Lambda, Step Functions, EMR, SNS, and SQS.
    Designed and implemented data models and data warehouse solutions using AWS services such as Athena and Redshift.
    Built a data pipeline to extract SQL Server data and merge it with transactional data in an S3 bucket.
    Built and managed streaming data pipelines using AWS Kinesis and Apache Kafka (a sketch follows below).
    Developed and maintained data ingestion pipelines for sources such as web logs, application logs, and clickstream data.
    Developed procedures and CDS table functions to implement a code-to-data paradigm and benefit from the capabilities of SAP HANA.
    Designed the data warehouse on Redshift and built the star-schema (fact and dimension tables) data model for clickstream data.
    Managed AWS resources and infrastructure using AWS CloudFormation.
    Implemented data backup and archiving policies using AWS S3 and Glacier.
    Configured SAP modules to align with billing requirements, including customizing data fields and setting up user roles and permissions.
    Migrated data from SQL Server to Amazon RDS using AWS Database Migration Service (DMS) with minimal downtime.
    Worked with DynamoDB to store log files and with geospatial data for customer segmentation.
    Built alert notification systems leveraging AWS SNS to send automated emails on load failures.
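    A minimal sketch of a Kinesis producer for the streaming pipelines mentioned above, assuming a hypothetical stream name and event schema:

        import json
        import time
        import boto3

        kinesis = boto3.client("kinesis", region_name="us-east-1")

        # Hypothetical stream carrying the clickstream events described above.
        STREAM_NAME = "clickstream-events"

        def publish_event(event: dict) -> None:
            """Write one clickstream record; partitioning by user keeps a
            given user's events ordered within a shard."""
            kinesis.put_record(
                StreamName=STREAM_NAME,
                Data=json.dumps(event).encode("utf-8"),
                PartitionKey=event["user_id"],
            )

        publish_event({"user_id": "u-123", "page": "/home", "ts": time.time()})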
  • Johnson & Johnson
    Data Engineer
    Johnson & Johnson Apr 2017 - Feb 2019
    New Jersey, United States
    Designed and implemented end-to-end real-time data processing pipelines using Azure services such as Azure Stream Analytics, Azure Event Hubs, Azure Synapse, and Azure Functions (an Event Hubs sketch follows below).
    Created ADF pipelines to load data from on-prem sources into an Azure SQL Server database and Azure Data Lake Storage.
    Built data pipelines using Azure Data Factory and Azure Databricks, loading data into Azure Data Lake.
    Developed and maintained an ETL pipeline to migrate data to Azure Synapse for analysis.
    Used Azure Data Lake Analytics and HDInsight/Databricks to generate ad hoc reports.
    Worked on all aspects of data mining, data collection, data cleaning, model development, data validation, and data visualization.
    Configured Azure API Management to securely expose and manage APIs for accessing data from internal sources such as Azure SQL DB, Azure Blob Storage, and Azure Data Lake.
    Monitored real-time events, mainly drug reactions, regulatory updates, and protocol deviations, and triggered automated workflows based on predetermined conditions and thresholds using Azure Logic Apps.
    Supported the migration of data from SQL Server to Azure SQL Database and Azure Blob Storage; wrote stored procedures in SQL Server, including dynamic SQL, to process records based on input parameters.
    Designed and developed business intelligence solutions using SSIS, SSRS, and SSAS.
    Managed the Kafka cluster and created topics to store data for different events.
    Created and provisioned numerous Databricks clusters for batch and continuous streaming data processing and installed libraries for the clusters.
    Integrated Azure services end to end, including Synapse with Power BI for visualizations.
    Used Azure DevOps to maintain applications and built pipelines for promoting machine learning models through Azure dev/test/prod environments.
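    A minimal sketch of publishing to Azure Event Hubs with the azure-eventhub Python SDK, as one entry point to the real-time pipeline described above; the connection string, hub name, and event fields are placeholders, not values from the profile:

        import json
        from azure.eventhub import EventHubProducerClient, EventData

        # Hypothetical connection string and hub; real values would come from
        # Key Vault or application settings.
        CONN_STR = "Endpoint=sb://example.servicebus.windows.net/;SharedAccessKeyName=send;SharedAccessKey=REDACTED"
        EVENT_HUB = "adverse-events"

        producer = EventHubProducerClient.from_connection_string(
            CONN_STR, eventhub_name=EVENT_HUB
        )

        # Send one batch; downstream Stream Analytics / Logic Apps react to these.
        with producer:
            batch = producer.create_batch()
            batch.add(EventData(json.dumps(
                {"type": "drug_reaction", "severity": "moderate"}
            )))
            producer.send_batch(batch)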
  • Target
    Data Engineer
    Target Apr 2015 - Mar 2017
    Minnesota, United States
    Designed and developed big data analytics solutions and engaged clients in technical discussions.
    Worked across Azure platforms including Azure Data Factory, Azure Synapse, Azure Data Lake, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, and HDInsight.
    Created and implemented custom Hadoop applications in the Azure environment.
    Created ADF pipelines to load data from on-prem sources into an Azure SQL Server database and Azure Data Lake Storage.
    Used Azure Data Lake Analytics and HDInsight/Databricks to generate ad hoc analysis.
    Developed custom ETL solutions, batch processing, and real-time data ingestion pipelines to move data in and out of Hadoop using PySpark and shell scripting.
    Ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
    Managed Azure Data Lake Storage (ADLS) and Databricks Delta Lake, integrating them with other Azure services.
    Used Zeppelin, Jupyter notebooks, and spark-shell to develop, test, and analyze Spark jobs before scheduling customized Spark jobs.
    Worked with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (SQL DW).
    Applied Hive tuning techniques such as partitioning, bucketing, and memory optimization.
    Used Databricks widgets to pass parameters at run time from ADF to Databricks (a sketch follows below).
    Integrated data storage options with Spark, notably Azure Data Lake Storage and Blob storage.
    Created an Oozie workflow to automate loading data into HDFS and Hive.
    Created and provisioned numerous Databricks clusters for batch and continuous streaming data processing and installed the required libraries.
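    A minimal sketch of the ADF-to-Databricks parameter handoff via widgets. It assumes a Databricks notebook context (where dbutils and spark are predefined) and uses hypothetical parameter names and a hypothetical target table:

        from pyspark.sql import functions as F

        # ADF's Databricks Notebook activity passes baseParameters that
        # surface in the notebook as widgets.
        dbutils.widgets.text("load_date", "")     # hypothetical parameter names
        dbutils.widgets.text("source_path", "")

        load_date = dbutils.widgets.get("load_date")
        source_path = dbutils.widgets.get("source_path")

        # Read the day's files and append them to a partitioned Delta table.
        df = spark.read.json(source_path)
        (df.withColumn("load_date", F.lit(load_date))
           .write.format("delta")
           .mode("append")
           .partitionBy("load_date")
           .saveAsTable("curated.daily_events"))  # hypothetical target table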
  • Blue Cross Blue Shield
    ETL Developer
    Blue Cross Blue Shield Jan 2014 - Mar 2015
    New Jersey, United States
    Contributed significantly to big data integration and analytics initiatives, leveraging Hadoop, Spark, Spark SQL, and NoSQL databases.
    Developed PySpark (Python) scripts to preprocess and transform data (a sketch follows below).
    Participated in cluster maintenance, including adding and removing cluster nodes, continuous monitoring, troubleshooting, and overseeing data backups and log files; used Oozie for workflow orchestration.
    Loaded data from the Linux file system to HDFS, applying data migration concepts within the Hadoop ecosystem.
    Wrote shell and Bash scripts to process files and clean data.
    Conducted in-depth data analysis using the Hadoop components Hive and Pig, crafting Hive queries that let market analysts identify emerging trends by comparing fresh data with EDW reference tables and historical metrics.
    Imported data to HDFS using Sqoop from diverse RDBMS servers and exported aggregated data back to the RDBMS servers with Sqoop for other ETL operations.
    Developed Java-based HBase client programs and web services, and moved data from Oracle and MS SQL Server into HDFS using Sqoop.
    Supported analysts and the test team in writing effective Hive queries, and contributed test scripts supporting test-driven development and continuous integration.
    Took part in installing operating systems, Hadoop updates, patches, and version upgrades in collaboration with application teams; hands-on with Hadoop cluster management, MapReduce jobs, and data migration in Hive.
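    A minimal sketch of the kind of PySpark preprocessing script described above, writing a cleaned dataset to a Hive table; the HDFS path, columns, and table name are hypothetical:

        from pyspark.sql import SparkSession, functions as F

        spark = (SparkSession.builder
                 .appName("claims-preprocessing")
                 .enableHiveSupport()   # expose results to Hive/Pig analysts
                 .getOrCreate())

        # Hypothetical HDFS path and columns for raw extracts.
        raw = spark.read.option("header", True).csv("hdfs:///data/raw/claims/")

        cleaned = (raw
                   .dropDuplicates(["claim_id"])
                   .filter(F.col("claim_amount").isNotNull())
                   .withColumn("claim_amount", F.col("claim_amount").cast("double"))
                   .withColumn("service_date", F.to_date("service_date", "yyyy-MM-dd")))

        # Land the cleaned data where downstream Hive queries can compare it
        # against EDW reference tables.
        cleaned.write.mode("overwrite").saveAsTable("edw_stage.claims_clean")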

Muhammad Q Education Details

New Jersey Institute Of Technology

Frequently Asked Questions about Muhammad Q

What company does Muhammad Q work for?

Muhammad Q works for Capital One.

What is Muhammad Q's role at the current company?

Muhammad Q's current role is Sr. Data Engineer at Capital One (Lead - AWS | Hadoop | Big Data | Snowflake | Talend | Ab Initio | Spark | PySpark | Azure | ADF | SQL | Python | Scala | Java | Tableau | Git | Jenkins | Kafka | MongoDB | MySQL).

What schools did Muhammad Q attend?

Muhammad Q attended New Jersey Institute Of Technology.

Who are Muhammad Q's colleagues?

Muhammad Q's colleagues are Oliver Ridgeway, Narris Robin, Elaine J Morley, Rama Reddy, Christopher Durgin, Taylor Chapman, Thomas Miller.
