Sadia T

Sadia T Email and Phone Number

Big Data Engineer at Cummins Bridgeway Columbus LLC @ Cummins Inc.
Sadia T's Location
Richardson, Texas, United States
About Sadia T

Sadia T is a Big Data Engineer at Cummins Bridgeway Columbus LLC (Cummins Inc.).

Sadia T's Current Company Details
Cummins Inc.

Big Data Engineer at Cummins Bridgeway Columbus LLC
Sadia T's Work Experience Details
  • Cummins Inc.
    Senior Data Engineer
    Cummins Inc. Aug 2020 - Present
    Columbus, Indiana, US
    • Architected, designed, and developed business applications and data marts for reporting; involved in all phases of the development life cycle, including analysis, design, coding, unit testing, integration testing, review, and release, per business requirements.
    • Implemented a Spark GraphX application to analyze guest behavior for data science segments.
    • Worked on batch processing of data sources using Apache Spark and Elasticsearch.
    • Developed big data solutions focused on pattern matching and predictive modeling.
    • Collaborated with the EDW team on high-level design documents for the extract, transform, validate, and load (ETL) process: data dictionaries, metadata descriptions, file layouts, and flow diagrams.
    • Developed an estimation model for bundled product and service offerings to optimize and predict gross margin.
    • Designed the OLTP system environment and maintained metadata documentation; used a forward-engineering approach to design and create databases for the OLAP model.
    ENVIRONMENT: IBM DataStage, Python, Spark framework, AWS, Redshift, MS Excel, NoSQL, Tableau, T-SQL, ETL, RNN, LSTM, MS Access, XML, MS Office 2007, Outlook, MS SQL Server.
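The gross-margin estimation described above can be sketched at its simplest as revenue minus cost over revenue for a bundle. This is a minimal stdlib illustration, not the actual model; the item names and prices are hypothetical.

```python
# Minimal sketch of a gross-margin estimate for a bundled offering.
# Item names, prices, and costs below are hypothetical illustrations.

def gross_margin(items):
    """Gross margin = (revenue - cost) / revenue for a bundle of items."""
    revenue = sum(i["price"] for i in items)
    cost = sum(i["cost"] for i in items)
    if revenue == 0:
        return 0.0
    return (revenue - cost) / revenue

bundle = [
    {"name": "engine_service", "price": 1200.0, "cost": 700.0},
    {"name": "parts_kit", "price": 300.0, "cost": 240.0},
]

print(round(gross_margin(bundle), 4))  # (1500 - 940) / 1500 = 0.3733
```

A production model would predict margin from many such bundles; this only shows the quantity being optimized.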
  • Broadridge
    Senior Data Engineer
    Broadridge Jan 2019 - Jul 2020
    New York, New York, US
    • Developed Spark programs in Python, applying functional-programming principles to process complex structured data sets.
    • Worked in a fast-paced agile development environment to quickly analyze, develop, and test potential business use cases.
    • Responsible for the design and development of high-performance data architectures supporting data warehousing, real-time ETL, and batch big-data processing.
    • The project focused on reporting detailed commercial-loan information to a federal department while applying data-governance controls.
    • Worked with Hadoop infrastructure to store data in HDFS and used Spark / Hive SQL to migrate the underlying SQL codebase to AWS.
    • Converted Hive/SQL queries into Spark transformations using Spark RDDs and PySpark.
    • Analyzed SQL scripts and designed solutions to implement them using PySpark.
    • Exported tables from Teradata to HDFS using Sqoop and built tables in Hive.
    • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/big data concepts.
    • Used Spark SQL to load JSON data, create a schema RDD, and load it into Hive tables; handled structured data with Spark SQL.
    • Worked with the Hadoop ecosystem; implemented Spark using Scala and used the DataFrame and Spark SQL APIs for faster data processing.
    ENVIRONMENT: IBM InfoSphere DataStage 9.1/11.5, Oracle 11g, flat files, Autosys, UNIX, Erwin, TOAD, MS SQL Server, XML files, AWS, MS Access.
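Converting a Hive/SQL query into Spark transformations, as described above, amounts to translating WHERE into a filter and GROUP BY/SUM into a keyed reduce. The sketch below shows that translation with plain Python over a list of dicts (so it runs without a Spark cluster); in PySpark the same pipeline would be a filter/map/reduceByKey over an RDD. The sample rows and column names are hypothetical.

```python
# Stdlib analogue of converting a Hive/SQL query into Spark-style
# transformations. The query being translated:
#   SELECT state, SUM(amount) FROM loans
#   WHERE status = 'active' GROUP BY state
# Sample rows are hypothetical.

from collections import defaultdict

loans = [
    {"state": "NY", "status": "active", "amount": 100.0},
    {"state": "NY", "status": "closed", "amount": 50.0},
    {"state": "TX", "status": "active", "amount": 75.0},
    {"state": "NY", "status": "active", "amount": 25.0},
]

# WHERE status = 'active'  ->  filter
active = (row for row in loans if row["status"] == "active")

# GROUP BY state, SUM(amount)  ->  map to (key, value) pairs, reduce by key
totals = defaultdict(float)
for row in active:
    totals[row["state"]] += row["amount"]

print(dict(totals))  # {'NY': 125.0, 'TX': 75.0}
```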
  • Homesite Insurance
    Big Data Engineer
    Homesite Insurance Jul 2016 - Dec 2018
    Boston, MA, US
    • Responsible for analyzing large data sets to develop custom models and algorithms that drive innovative business solutions.
    • Involved in designing data warehouses and data lakes on both conventional (Oracle, SQL Server) and high-performance big data (Hadoop Hive and HBase) databases; data modeling, design, implementation, and deployment of high-performance custom applications at scale on Hadoop/Spark.
    • Generated ad-hoc SQL queries using joins, database connections, and transformation rules to fetch data from legacy DB2 and SQL Server database systems.
    • Translated business requirements into working logical and physical data models for OLTP and OLAP systems.
    • Created BTEQ, FastExport, MultiLoad, TPump, and FastLoad scripts for extracting data from various production systems.
    • Reviewed stored procedures for reports and wrote test queries against the source system (SQL Server / SSRS) to match the results of the actual report against the data mart (Oracle).
    • Performed data profiling and preliminary data analysis; handled anomalies such as missing values, duplicates, outliers, and irrelevant imputed data. Removed outliers using proximity-distance and density-based techniques.
    • Involved in analysis, design, and implementation/translation of business user requirements.
    • Used supervised, unsupervised, and regression techniques in building models.
    ENVIRONMENT: Python, SQL Server, Hadoop, HDFS, HBase, MapReduce, Hive, Impala, Pig, Sqoop, Mahout, LSTM, RNN, Spark MLlib, MongoDB, AWS, Tableau, Unix/Linux.
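Proximity-distance outlier removal, as mentioned above, typically flags points whose distance to their nearest neighbours is unusually large. A simplified 1-D stdlib sketch of the idea, assuming a mean k-nearest-neighbour distance with a fixed cutoff (the data and threshold are hypothetical):

```python
# Sketch of proximity-distance outlier removal: flag values whose mean
# distance to their k nearest neighbours exceeds a cutoff.
# Data, k, and cutoff are hypothetical illustrations.

def knn_outliers(values, k=2, cutoff=10.0):
    """Return values whose mean distance to the k nearest neighbours > cutoff."""
    outliers = []
    for i, v in enumerate(values):
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        if sum(dists[:k]) / k > cutoff:
            outliers.append(v)
    return outliers

data = [10.0, 11.0, 12.0, 10.5, 95.0]
flagged = knn_outliers(data)
print(flagged)  # [95.0] -- far from every neighbour
clean = [v for v in data if v not in flagged]
```

Density-based variants (e.g. Local Outlier Factor) refine this by comparing each point's neighbourhood density to its neighbours' densities rather than using a global cutoff.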
  • Grapesoft Solutions
    Data Engineer
    Grapesoft Solutions Apr 2014 - Mar 2016
    • Worked on collecting large data sets using Python scripting and Spark SQL.
    • Worked on large sets of structured and unstructured data.
    • Created deep-learning algorithms using LSTM and RNN.
    • Actively involved in designing and developing data ingestion, aggregation, and integration in a Hadoop environment.
    • Developed Sqoop scripts to import and export data from relational sources, handling incremental loading of customer and transaction data by date.
    • Developed Sqoop scripts to migrate data from Oracle to the big data environment.
    • Worked extensively with Avro and Parquet files; parsed semi-structured JSON data and converted it to Parquet using DataFrames in Spark.
    • Converted all Hadoop jobs to run on EMR by configuring the cluster according to the data size.
    • Implemented Spring Security to guard against SQL injection and manage user access privileges; used various Java/J2EE design patterns such as DAO, DTO, and Singleton.
    • Experienced in creating Hive tables with partitioning and bucketing.
    • Performed data analysis and data profiling using complex SQL queries on various source systems, including Oracle 10g/11g and SQL Server 2012.
    ENVIRONMENT: R, SQL Server, Oracle, HDFS, HBase, AWS, MapReduce, Hive, Impala, Pig, Sqoop, NoSQL, Tableau, RNN, LSTM, Unix/Linux, Core Java.
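Parsing semi-structured JSON before converting it to Parquet, as described above, usually means flattening nested objects into flat column names. Writing Parquet itself requires Spark or pyarrow, so this stdlib sketch stops at the flattened rows; the sample payload and dotted-name convention are hypothetical.

```python
# Sketch of flattening semi-structured JSON into flat records, the step
# that precedes writing Parquet with Spark DataFrames. Sample payload
# is hypothetical.

import json

raw = '{"customer": {"id": 7, "name": "Acme"}, "txn": {"date": "2015-06-01", "amount": 42.5}}'

def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted column names, e.g. customer.id."""
    row = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, name + "."))
        else:
            row[name] = value
    return row

record = flatten(json.loads(raw))
print(record)
# {'customer.id': 7, 'customer.name': 'Acme',
#  'txn.date': '2015-06-01', 'txn.amount': 42.5}
```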
  • Dhruvsoft Services Private Limited
    Senior Data Engineer
    Dhruvsoft Services Private Limited Aug 2012 - Mar 2014
    Hyderabad, Telangana, IN
    • Worked on various data flow and control flow tasks: for-loop containers, sequence containers, script tasks, Execute SQL tasks, and package configuration.
    • Created new stored procedures to handle complex business logic, and modified existing stored procedures, functions, views, and tables for new project enhancements and to resolve existing defects.
    • Loaded data from sources such as OLE DB and flat files into a SQL Server 2012 database using SSIS packages, creating data mappings to load data from source to destination.
    ENVIRONMENT: MS SQL Server 2005 & 2008, SQL Server Business Intelligence Development Studio, SSIS 2008, SSRS 2008, Report Builder, Office, Excel, flat files, .NET, T-SQL.

Frequently Asked Questions about Sadia T

What company does Sadia T work for?

Sadia T works for Cummins Inc.

What is Sadia T's role at the current company?

Sadia T's current role is Big Data Engineer at Cummins Bridgeway Columbus LLC.
