Bilal Qureshi

Senior Big Data Engineer @ New York State Department of State
United States
Bilal Qureshi's Location
United States
About Bilal Qureshi

With over 10 years in IT, I specialize in leveraging advanced big data technologies to solve complex data processing challenges. Throughout my career, I've mastered the entire big data lifecycle: from initial analysis and design through development, implementation, maintenance, and support.

I have extensive experience across the Hadoop ecosystem, including MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Flume, and Spark. My expertise includes designing and implementing algorithms for advanced analytics using Cassandra with Spark and Scala, and I excel in creating real-time data visualizations on Hadoop platforms such as Platfora, enhancing decision-making with dynamic dashboards.

In NoSQL databases, I've developed applications using Cassandra and MongoDB, writing custom UDFs to extend functionality in Pig and Hive. Proficient in running complex queries with Impala and BI tools on Hadoop clusters, I automate workflows using Oozie and troubleshoot issues across Hadoop components.

I bring extensive cloud experience with Cloudera, AWS (EC2, S3, Glue, Lambda, Redshift), Microsoft Azure (Data Lake Analytics, Data Factory, Databricks), and Hortonworks. My specialties include Spark architecture, Structured Streaming, and scalable real-time data processing with Apache Solr and Kafka.

My technical toolkit includes Java, Scala, Python, SQL, and PL/SQL, along with expertise in Servlets, JSP, Struts, Spring, Hibernate, and RESTful web services. I also develop machine learning models using Python and scikit-learn.

Known for collaborative skills, I thrive in cross-functional teams, delivering solutions by converting Hive/SQL queries into Spark transformations with DataFrames and Scala and integrating Kafka for stream processing. Throughout my career, I've prioritized continuous learning and adapting to emerging technologies, ensuring I stay at the forefront.
My goal is to leverage my skills and experience to drive impactful solutions aligned with business objectives.
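The Hive/SQL-to-DataFrame conversion mentioned above can be sketched with a plain-Python analogue (toy data invented for illustration; an actual job would use PySpark's `filter`/`groupBy`/`agg` chain rather than list operations):

```python
# Hypothetical illustration of rewriting a Hive/SQL aggregation as chained
# DataFrame-style transformations; pure Python stands in for PySpark here.
#
# Hive/SQL:  SELECT dept, AVG(salary) FROM employees
#            WHERE salary > 50000 GROUP BY dept
from collections import defaultdict

employees = [  # toy rows standing in for a Hive table
    {"dept": "eng", "salary": 90000},
    {"dept": "eng", "salary": 40000},
    {"dept": "ops", "salary": 60000},
    {"dept": "ops", "salary": 80000},
]

# filter -> groupBy -> agg, mirroring df.filter(...).groupBy("dept").avg("salary")
filtered = [r for r in employees if r["salary"] > 50000]
groups = defaultdict(list)
for r in filtered:
    groups[r["dept"]].append(r["salary"])
avg_salary = {dept: sum(s) / len(s) for dept, s in groups.items()}
print(avg_salary)  # {'eng': 90000.0, 'ops': 70000.0}
```

The point of the conversion is that each SQL clause maps to one transformation in the chain, which Spark can then plan and distribute.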

Bilal Qureshi's Current Company Details
New York State Department of State

Senior Big Data Engineer
United States
Website:
parks.ny.gov
Employees:
1083
Bilal Qureshi Work Experience Details
  • New York State Department Of State
    Senior Big Data Engineer
    New York State Department Of State Jun 2023 - Present
    New York, New York, United States
    - Analyzed business requirements and prepared detailed specifications that follow project guidelines required for project development.
    - Extensively used Apache Kafka, Apache Spark, HDFS, and Apache Impala to build near-real-time data pipelines that ingest, transform, store, and analyze clickstream data to provide a more personalized user experience.
    - Led data migration using SQL, Azure SQL, Azure Storage, Azure Data Factory, SSIS, and PowerShell.
    - Designed and implemented a configurable data delivery pipeline, built with Python, for scheduled updates to customer-facing data stores.
    - Proficient in machine learning techniques (decision trees, linear/logistic regression) and statistical modeling.
    - Implemented medium-to-large-scale BI solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Data Lake Analytics, Stream Analytics, Azure SQL DW, HDInsight/Databricks, NoSQL DB).
    - Performed data extraction, transformation, loading, and integration in data warehouses, operational data stores, and master data management.
    - Experienced in ETL concepts, building ETL solutions, and data modeling.
    - Architected the ETL transformation layers and wrote Spark jobs to do the processing.
    - Aggregated daily sales team updates to report to executives and to organize jobs running on Spark clusters.
    - Optimized a TensorFlow model for efficiency.
    - Used PySpark for DataFrames, ETL, data mapping, transformation, and loading in a complex, high-volume environment.
    - Implemented Apache Airflow for authoring, scheduling, and monitoring data pipelines.
    - Created Spark code to process streaming data from a Kafka cluster and load it to a staging area for processing.
    - Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team using Tableau.
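The get/transform/analyze steps of such a clickstream pipeline can be sketched per micro-batch in plain Python (the event shape and field names are assumptions; the real job would run inside Spark Structured Streaming with Kafka as the source and HDFS/Impala tables as the sink):

```python
# Minimal sketch of one micro-batch transform in a clickstream pipeline.
# Hypothetical message format: one JSON click event per Kafka message.
import json
from collections import Counter

raw_batch = [  # stand-in for one micro-batch pulled from Kafka
    '{"user": "u1", "page": "/home"}',
    '{"user": "u1", "page": "/cart"}',
    '{"user": "u2", "page": "/home"}',
]

events = [json.loads(m) for m in raw_batch]          # get: deserialize messages
views_per_page = Counter(e["page"] for e in events)  # transform: aggregate views
print(views_per_page["/home"])  # 2
```

In the streaming version, the same aggregation would be expressed once over the unbounded stream, and Spark would apply it incrementally to each arriving batch.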
  • Empower
    Big Data Engineer
    Empower Jan 2022 - May 2023
    Greenwood Village, Colorado, United States
    - Ingested data through cleansing and transformation, leveraging AWS Lambda, AWS Glue, and Step Functions.
    - Created monitors, alarms, notifications, and logs for Lambda functions, Glue jobs, and EC2 hosts using CloudWatch.
    - Worked on AWS EMR clusters for processing big data across a Hadoop cluster of virtual servers.
    - Developed various mappings with the collection of all sources, targets, and transformations using Informatica Designer.
    - Developed a Python script to transfer data via REST APIs and extract data from on-premises systems to AWS S3.
    - Implemented a microservices-based cloud architecture using Spring Boot.
    - Worked with Docker container snapshots, attaching to running containers, removing images, managing directory structures, and managing containers.
    - Collected data using Spark Streaming from an AWS S3 bucket in near real time and performed the necessary transformations and aggregations on the fly to build the common learner data model, persisting the data in HDFS.
    - Used Apache NiFi to copy data from the local file system to HDP.
    - Designed both 3NF data models and dimensional data models using star and snowflake schemas.
    - Handled message streaming data through Kafka to S3.
    - Implemented a Python script for creating the AWS CloudFormation template to build an EMR cluster with the required instance types.
    - Loaded files to Hive and HDFS from Oracle and SQL Server using Sqoop.
    - Developed highly complex Python and Scala code that is maintainable, easy to use, and satisfies application requirements for data processing and analytics using built-in libraries.
    - Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark context, Spark SQL, PostgreSQL, DataFrames, OpenShift, Talend, and pair RDDs.
    - Experienced with deploying Hadoop in VMs and the AWS cloud as well as in physical server environments.
    - Monitored Hadoop cluster connectivity, security, and file system management.
  • Dillard's Inc.
    Data Engineer
    Dillard's Inc. May 2020 - Dec 2021
    Little Rock, Arkansas, United States
    - Designed and built a multi-terabyte, full end-to-end data warehouse infrastructure from the ground up on Redshift, handling millions of records every day.
    - Developed SSRS reports and SSIS packages to extract, transform, and load data from various source systems.
    - Developed a data pipeline using Kafka to store data into HDFS.
    - Worked on big data with AWS cloud services, i.e., EC2, S3, EMR, and DynamoDB.
    - Created entity-relationship diagrams (ERDs), functional diagrams, and data flow diagrams; enforced referential integrity constraints; and created logical and physical models using Erwin.
    - Created ad hoc queries and reports to support business decisions with SQL Server Reporting Services (SSRS).
    - Strong understanding of AWS components such as EC2 and S3.
    - Used Hive SQL, Presto SQL, and Spark SQL for ETL jobs, choosing the right technology for the job at hand.
    - Measured efficiency of the Hadoop/Hive environment, ensuring SLAs were met.
    - Optimized a TensorFlow model for efficiency.
    - Analyzed the system for new enhancements/functionalities and performed impact analysis of the application for implementing ETL changes.
    - Managed security groups on AWS, focusing on high availability, fault tolerance, and auto scaling using Terraform templates, along with continuous integration and continuous deployment using AWS Lambda and AWS CodePipeline.
    - Forward-engineered the logical models to generate the physical model using Erwin and deployed the resulting data models to the enterprise data warehouse.
    - Wrote various data normalization jobs for new data ingested into Redshift.
    - Defined facts and dimensions and designed the data marts using Ralph Kimball's dimensional data mart modeling methodology in Erwin.
    - Created various complex SSIS/ETL packages to extract, transform, and load data.
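The Kimball-style fact/dimension split can be sketched with toy tables (the table contents and column names are invented for illustration; the actual models were designed in Erwin and deployed to Redshift):

```python
# Hypothetical star-schema sketch: a sales fact table keyed to a product
# dimension, rolled up by a dimension attribute as a data-mart query would be.
dim_product = {                       # dimension: one row per surrogate key
    1: {"name": "shoes", "category": "apparel"},
    2: {"name": "lamp",  "category": "home"},
}
fact_sales = [                        # fact: one row per sale event
    {"product_key": 1, "amount": 59.99},
    {"product_key": 1, "amount": 49.99},
    {"product_key": 2, "amount": 19.99},
]

# Join each fact row to its dimension row, then aggregate by category.
revenue = {}
for row in fact_sales:
    cat = dim_product[row["product_key"]]["category"]
    revenue[cat] = round(revenue.get(cat, 0.0) + row["amount"], 2)
print(revenue)  # {'apparel': 109.98, 'home': 19.99}
```

Keeping descriptive attributes in the dimension and only keys plus measures in the fact table is what lets the warehouse answer many roll-up questions from one narrow, append-only fact table.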
  • Broadridge
    Data Engineer
    Broadridge Jan 2018 - Apr 2020
    New York, New York, United States
    - Created HBase tables to load large sets of structured data.
    - Managed and reviewed Hadoop log files.
    - Used AWS Glue for data transformation, validation, and cleansing.
    - Used Sqoop extensively to import data from various systems/sources (such as MySQL) into HDFS.
    - Created components such as Hive UDFs to supply functionality missing in Hive for analytics.
    - Developed scripts and batch jobs to schedule a bundle (a group of various coordinators).
    - Used different file formats such as text files, sequence files, and Avro.
    - Provided cluster coordination services through ZooKeeper.
    - Worked extensively with Hive DDLs and Hive Query Language (HQL).
    - Analyzed the data using MapReduce, Pig, and Hive and produced summary results from Hadoop for downstream systems.
    - Used Pig as an ETL tool to do transformations, event joins, and some pre-aggregations before storing the data onto HDFS.
    - Developed a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
    - Used Sqoop to import and export data between HDFS and RDBMSs.
    - Exported the analyzed data to the relational database MySQL using Sqoop for visualization and to generate reports.
    - Developed UDF, UDAF, and UDTF functions and implemented them in Hive queries.
    - Implemented Sqoop for large dataset transfers between Hadoop and RDBMSs.
    - Processed data into HDFS by developing solutions.
    - Created MapReduce jobs to convert periodic batches of XML messages into partitioned Avro data.
    - Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
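One way to supply missing Hive functionality from a script is Hive's `TRANSFORM` clause, which pipes tab-separated rows through an external program's stdin/stdout. This is a hedged sketch of that streaming style (native Hive UDFs are written in Java; the column layout and `normalize` helper here are assumptions for illustration):

```python
# Hypothetical Hive streaming transform: Hive feeds tab-separated rows on
# stdin and reads transformed rows back from stdout.
import sys

def normalize(line: str) -> str:
    """Lower-case the second column of a tab-separated row."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) >= 2:
        cols[1] = cols[1].lower()
    return "\t".join(cols)

def main(stream=None):
    # In production, Hive supplies sys.stdin; a list is used here for the demo.
    for line in (stream if stream is not None else sys.stdin):
        print(normalize(line))

if __name__ == "__main__":
    main(["1\tABC\tx", "2\tDeF\ty"])  # demo rows instead of live stdin
```

In HiveQL this script would be invoked with something like `SELECT TRANSFORM (id, name, tag) USING 'python normalize.py' AS (id, name, tag) FROM t;`.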
  • Paz Technologies
    Data Engineer
    Paz Technologies Feb 2014 - Sep 2017
    Karāchi, Sindh, Pakistan
    - Loaded data from MySQL server into Hadoop clusters using the data ingestion tool Sqoop.
    - Extensively worked with PySpark/Spark SQL for data cleansing and generating DataFrames and RDDs.
    - Created Hive tables, loaded them with data, and wrote Hive queries on top of data present in HDFS.
    - Tuned the performance of Pig queries and developed Pig scripts for processing data.
    - Wrote Hive queries to transform data into a tabular format and processed the results using Hive Query Language.
    - Loaded real-time unstructured data such as XML data and log files into HDFS using Apache Flume.
    - Processed large amounts of both structured and unstructured data using the MapReduce framework.
    - Designed solutions to perform ETL tasks such as data acquisition, transformation, cleaning, and efficient data storage on HDFS.
    - Developed Spark code using Scala and Spark Streaming for faster testing and processing of data.
    - Stored the resulting processed data back into the Hadoop Distributed File System.
    - Applied machine learning algorithms (k-nearest neighbors, random forest) using Spark MLlib on top of HDFS data and compared the accuracy of the models.
    - Used Tableau to visualize the outcomes of the ML algorithms.
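The k-nearest-neighbors idea behind the MLlib work above can be sketched in pure Python on toy 2-D points (the data and `k` value are invented for illustration; the real models were trained with Spark MLlib over HDFS-resident features):

```python
# Toy k-nearest-neighbors classifier: label a point by majority vote among
# the k training points closest to it in Euclidean distance.
import math
from collections import Counter

train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"),
         ((4.0, 4.0), "b"), ((4.2, 3.9), "b")]

def knn_predict(point, k=3):
    # Sort training points by distance, then vote among the k nearest labels.
    by_dist = sorted(train, key=lambda t: math.dist(point, t[0]))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((1.1, 0.9)))  # 'a'
```

Comparing such a distance-based model against an ensemble like random forest on the same held-out data is what the accuracy comparison mentioned above amounts to.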

Bilal Qureshi Education Details

Frequently Asked Questions about Bilal Qureshi

What company does Bilal Qureshi work for?

Bilal Qureshi works for the New York State Department of State.

What is Bilal Qureshi's role at the current company?

Bilal Qureshi's current role is Senior Big Data Engineer.

What schools did Bilal Qureshi attend?

Bilal Qureshi attended Sir Syed University of Engineering & Technology (SSUET).
