P M Email and Phone Number
P M is a Big Data Developer at Prime Therapeutics and is actively looking for new opportunities on C2C.
Prime Therapeutics
- Website: primetherapeutics.com
- Employees: 3,535
Big Data Developer, Prime Therapeutics
Jan 2019 - Present, Eagan, Minnesota, United States
Responsibilities:
• Developed custom input adapters in Java for moving data from raw sources (FTP, S3) to HDFS.
• Developed Spark applications in Scala to perform data cleansing, data validation, data transformations, and other enrichments.
• Consumed data from Kafka using Kafka consumer groups; ingested claims from Kafka into Kudu using NiFi processors.
• Built a data pipeline consisting of Sqoop, custom-built input adapters, Spark, and Hive.
• Performed Hive modeling and wrote many Hive scripts for the data preparation needed to run machine learning models.
• Built a working prototype of a real-time workflow for streaming user events from external applications.
• Utilized Kafka and Spark Streaming to build the real-time pipeline.
• Converted existing MapReduce jobs to Spark jobs.
• Developed Airflow workflows to automate and productionize the data pipelines.
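The cleansing and validation work described above typically reduces to a per-record function applied inside a Spark map/filter stage. A minimal pure-Python sketch of that kind of logic, with no Spark dependency; the field names (claim_id, amount, service_date) are hypothetical, not taken from the actual pipeline:

```python
# Hypothetical per-record cleansing/validation, of the kind a Spark
# map/filter stage applies to raw claim records. Field names are illustrative.
from datetime import datetime
from typing import Optional

def clean_record(raw: dict) -> Optional[dict]:
    """Validate and normalize one raw record; return None to drop it."""
    claim_id = (raw.get("claim_id") or "").strip()
    if not claim_id:
        return None  # reject records missing the key field
    try:
        amount = round(float(raw.get("amount", "")), 2)
        service_date = datetime.strptime(raw.get("service_date", ""), "%Y-%m-%d").date()
    except ValueError:
        return None  # reject malformed numeric or date fields
    return {"claim_id": claim_id, "amount": amount,
            "service_date": service_date.isoformat()}

records = [
    {"claim_id": " C100 ", "amount": "12.5", "service_date": "2019-03-01"},
    {"claim_id": "", "amount": "9.99", "service_date": "2019-03-02"},       # dropped
    {"claim_id": "C101", "amount": "oops", "service_date": "2019-03-03"},   # dropped
]
cleaned = [r for r in (clean_record(x) for x in records) if r is not None]
```

In Spark the same function would be applied as `rdd.map(clean_record).filter(lambda r: r is not None)`, so keeping it a plain pure function makes it unit-testable outside the cluster.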
Azure Big Data Engineer, Centene Corporation
Feb 2018 - Dec 2018, St. Louis, Missouri, United States
Responsibilities:
• Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back.
• Worked with the respective business units to understand the scope of the analytics requirements.
• Performed core ETL transformations in Spark.
• Automated data pipelines covering data ingestion, cleansing, preparation, and analytics.
• Created end-to-end Spark applications in Python to perform data cleansing, validation, transformation, and summarization on user behavioral data.
• Converted existing MapReduce jobs into Spark transformations and actions using Spark RDDs, DataFrames, and the Spark SQL API.
• Developed an end-to-end data pipeline using an FTP adapter, Spark, Hive, and Impala.
• Used Python for all Spark use cases.
• Implemented design patterns in Scala for the application.
• Implemented Spark with Scala, relying heavily on Spark SQL for faster development and data processing.
• Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and YARN.
• Applied performance optimizations such as the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
• Created Oozie workflows and coordinators to automate daily, weekly, and monthly data pipelines.
Environment: Azure, HDFS, Hive, Sqoop, Flume, Spark, Python, HBase, Kafka, Impala, Oozie, Oracle 11g, YARN, UNIX shell scripting, Agile methodology
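Converting a MapReduce job to Spark usually means collapsing the mapper and reducer into a map plus an associative reduce-by-key. A pure-Python sketch of that reshaping using the classic word-count example (in PySpark the same shape would be `rdd.flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(add)`):

```python
from collections import defaultdict

# MapReduce word count collapsed into one pass: the mapper's (word, 1)
# emissions and the reducer's per-key sum become a single keyed fold,
# which is exactly the shape Spark's map + reduceByKey expresses.
def map_reduce_counts(lines):
    counts = defaultdict(int)
    for line in lines:            # flatMap: one line -> many tokens
        for word in line.split():
            counts[word] += 1     # reduceByKey: associative sum per key
    return dict(counts)

result = map_reduce_counts(["spark hive spark", "hive"])
```

The key property is that the per-key combine is associative and commutative, which is what lets Spark run it as partial aggregations before the shuffle.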
Hadoop Developer, Citi Bank Vietnam
Feb 2016 - Dec 2017, Irving, Texas, United States
Responsibilities:
• Worked with the business team to gather requirements and participated in Agile planning meetings to finalize the scope of each development.
• Responsible for building scalable distributed data solutions on CDH.
• Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
• Implemented data pipelines with multiple mappers using the ChainMapper API.
• Developed multiple MapReduce batch jobs in Java for loading data to HDFS in sequence-file format.
• Ingested structured data from a wide array of RDBMSs into HDFS as incremental imports using Sqoop.
• Wrote Pig scripts to wrangle the raw data, store it to HDFS, and load it into Hive tables using HCatalog.
• Configured Flume agents on different data sources to capture streaming log data from the web servers.
• Implemented Flume multiplexing to stream data from upstream pipes into HDFS.
• Created Hive external tables with clustering and partitioning on the date to optimize the performance of ad-hoc queries.
• Wrote HiveQL scripts on Beeline, Impala, and the Hive CLI for consumer data analysis to meet business requirements.
• Exported data from Hive to the DWH using Sqoop.
• Worked with different file formats and compression techniques to ensure optimal performance of Hive queries.
• Created Hive tables from a wide range of data formats, including CSV, text, sequence, Avro, Parquet, ORC, JSON, and custom formats, using SerDes.
• Transformed semi-structured log data to fit the schema of the Hive tables using Pig.
• Scheduled multiple Hive and Pig jobs with the Oozie workflow engine.
• Involved in testing and in designing low-level and high-level documentation for the business requirements.
Environment: Cloudera Hadoop, Eclipse, MapReduce, Java, Sqoop, Pig, Oozie, Hive, Flume, CentOS, MySQL, Oracle DB
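The incremental Sqoop imports mentioned above rest on a simple watermark idea: each run pulls only rows whose check column exceeds the last imported value, then advances the watermark (Sqoop's `--incremental append --check-column ... --last-value ...` mode). A small pure-Python sketch of that logic; the row shape and key name are hypothetical:

```python
# Sketch of the high-watermark logic behind incremental imports: only rows
# past the saved watermark move, and the watermark advances after each run.
def incremental_import(rows, last_value, key="id"):
    """Return (rows newer than the watermark, new watermark)."""
    new_rows = [r for r in rows if r[key] > last_value]
    new_watermark = max((r[key] for r in new_rows), default=last_value)
    return new_rows, new_watermark

source = [{"id": 1}, {"id": 2}, {"id": 3}]
batch1, wm = incremental_import(source, last_value=0)   # first full pull
source.append({"id": 4})                                # new row arrives
batch2, wm = incremental_import(source, last_value=wm)  # only the delta moves
```

This only works when the check column is monotonically increasing (an auto-increment key or a last-modified timestamp); updates to old rows need the `lastmodified` variant instead.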
Hadoop Engineer, T-Mobile
Dec 2014 - Jan 2016, Frisco, Texas, United States
Responsibilities:
• Imported data from relational sources such as Teradata and Oracle into HDFS using Sqoop.
• Imported bulk data into HBase tables using MapReduce programs.
• Inserted time-series data into HBase using the HBase Java API.
• Designed and implemented incremental imports into Hive tables.
• Used the REST API to access HBase data for analytics.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data.
• Collected, aggregated, and moved data from servers to HDFS using Apache Flume.
• Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
• Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
• Managed and reviewed the Hadoop log files.
• Migrated ETL jobs to Pig scripts to perform transformations, joins, and pre-aggregations before storing the data in HDFS.
• Worked with the Avro data serialization system to handle JSON data formats.
• Processed different file formats, including sequence files, XML files, and map files, using MapReduce programs.
• Performed unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
• Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization.
• Worked on the Oozie workflow engine for job scheduling.
Environment: Hadoop, Hortonworks, HDFS, MapReduce, Hive, Teradata, Oozie, Sqoop, Pig, Java, REST API, Maven, MRUnit, JUnit
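Storing time-series data in HBase hinges on row-key design, since HBase sorts rows lexicographically. A common scheme (not necessarily the one used here) prefixes the entity id and appends a reversed timestamp so the newest cell for an entity sorts first and a scan can stop early. A small Python sketch of the key layout; the entity name is hypothetical:

```python
# Sketch of a common HBase time-series row-key scheme: entity id plus a
# zero-padded reversed timestamp, so within one entity the newest reading
# sorts first under HBase's lexicographic row ordering.
LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE, the usual reversal base

def row_key(entity_id: str, epoch_ms: int) -> str:
    # Zero-pad to fixed width so string order matches numeric order.
    return f"{entity_id}:{LONG_MAX - epoch_ms:019d}"

# Three readings for one meter, inserted out of order:
keys = sorted(row_key("meter-7", t) for t in (1000, 3000, 2000))
# After sorting (HBase's scan order), the newest timestamp comes first.
```

Leading with the entity id also spreads writes across regions better than a bare timestamp prefix, which would hotspot a single region.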
Java/Hadoop Developer, Express
Jan 2014 - Nov 2014, Ohio, United States
Responsibilities:
• Developed RESTful web services for transmitting data in JSON/XML format.
• Wrote SQL queries, functions, views, triggers, and stored procedures against an Oracle relational database.
Environment: Java, J2EE, Eclipse, JSP, Servlets, Spring, JavaScript, HTML, RESTful, shell scripting, XML
Java/Hadoop Developer, Freshworks
Oct 2012 - Dec 2013
Responsibilities:
• Developed web applications by coordinating requirements, user stories, use cases, screen mockups, schedules, and activities.
• Worked closely with client business stakeholders on Agile development teams.
• Supported users by developing documentation and assistance tools.
• Developed the presentation layer using the Spring Framework, including the Spring MVC and JDBC modules.
• Implemented RESTful web services with Jersey to integrate different application components.
• Developed RESTful web services for transmitting data in JSON/XML format.
• Wrote SQL queries, functions, views, triggers, and stored procedures against an Oracle relational database.
• Used Sqoop to ingest structured data from an Oracle database into HDFS.
• Wrote and ran MapReduce batch jobs in Java for data wrangling on the cluster.
• Developed map-side and reduce-side joins using the distributed cache on various datasets.
• Developed Pig Latin scripts to transform the data according to business requirements.
• Developed Pig UDFs in Java, extending eval and filter functions to filter semi-structured data.
Environment: Java, J2EE, Eclipse, JSP, Servlets, Spring, JavaScript, HTML, RESTful, shell scripting, XML, Oracle 10g, Cloudera Hadoop, MapReduce, Pig, HDFS
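A map-side join works by shipping the small dataset to every mapper (the role Hadoop's distributed cache plays), loading it into an in-memory lookup table, and joining streamed rows against it with no shuffle. A pure-Python sketch of that pattern; the column names are hypothetical:

```python
# Sketch of a map-side join: the small relation becomes an in-memory dict
# (standing in for Hadoop's distributed cache), and the big relation is
# streamed through it row by row, avoiding a reduce-side shuffle.
def map_side_join(big_rows, small_rows, key):
    lookup = {r[key]: r for r in small_rows}  # small side must fit in memory
    for row in big_rows:
        match = lookup.get(row[key])
        if match is not None:                 # inner join: drop non-matches
            yield {**match, **row}

orders = [{"cust": "a", "amt": 5}, {"cust": "x", "amt": 7}]
customers = [{"cust": "a", "name": "Ann"}]
joined = list(map_side_join(orders, customers, "cust"))
```

A reduce-side join, by contrast, shuffles both relations by key and merges them in the reducer; it handles two large inputs but costs a full shuffle, which is why the small-table case prefers the map-side variant.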
Frequently Asked Questions about P M
What company does P M work for?
P M works for Prime Therapeutics
What is P M's role at the current company?
P M's current role is Big Data Developer at Prime Therapeutics; P M is actively looking for new opportunities on C2C.
Who are P M's colleagues?
P M's colleagues are Sonya Richardson, Tracy Chapple, Sandra Thomason, H. R. Harty, Melissa Vosper, Manuel Chavez, Esq., and Azra Sokocevic, MBA.