Vivek Kumar Singh

Principal Architect, Data Products
Tracy, California, United States

About Vivek Kumar Singh

Connect with me for everything {Data}: Data Strategy, Data Migration, Transactional Warehouse, Modern Data Architecture, Streaming Data, Feature Store, Data Quality/Lineage/Governance.

Founder of the open source data quality project: https://sourceforge.net/projects/dataquality/
Databricks badge: https://credentials.databricks.com/7f1ca732-aa34-478d-8841-5e66d1e7a96b
Blog: https://viveksingh36.wordpress.com/

I have a unique blend of experience building highly scalable, highly available data products using batch, real-time, Lambda, and Kappa architectures. I have covered all major blocks of the data life cycle: semantic layer, data ingestion, data quality, data processing, data science, master data management, and data governance.

Vivek Kumar Singh's Current Company Details
Crux, Prudentials
Principal Architect, Data Products
Tracy, CA, US

Vivek Kumar Singh Work Experience Details
  • Crux, Prudentials
    Principal Architect, Data Products
    Tracy, CA, US
  • Crux
    Principal Architect, Data Products
    Crux Mar 2023 - Dec 2023
    San Francisco, CA, US
    Crux Informatics is an external data monetization company with complex custom pipelines on GCP (Google Cloud) connecting data producers and consumers. Work I did with Crux:
    - Built and socialized the "To Be" data architecture at CXO level
    - Optimized the Crux Data Domain Model (a unified model handling schema evolution, varied scheduling, and multi-consumer push)
    - Re-architected cloud storage to save cost
    - Designed and implemented a Databricks Marketplace for Crux data (Unity Catalog, access control, Databricks SDK/API, concurrent job handling)
    - Implemented data pipeline processing on Databricks from AWS S3 (notebooks, SQL Warehouse, external tables)
    - Built a Google Analytics Hub (GAH) pipeline to push external data into GAH
    - Ran a successful PoC with Jython and GraalVM to bring a low-code (JS and Python) framework into the Crux External Data Product (EDP) platform
    - Developed a customized SCD Type 2 pipeline using Databricks and Snowflake
    - Implemented a data validation pipeline using Dataproc and AWS PyDeequ
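The customized SCD Type 2 pipeline mentioned above ran on Databricks and Snowflake; what follows is only a minimal, plain-Python sketch of the Type 2 merge logic itself (hypothetical field names, not the actual implementation): when a tracked attribute changes, the open row is expired and a new version is appended.

```python
from datetime import date

# SCD Type 2 merge sketch (illustrative only, hypothetical field names):
# expire the current open row when attributes change, append a new version.

def scd2_merge(history, incoming, today):
    open_rows = {r["key"]: r for r in history if r["effective_to"] is None}
    out = list(history)
    for key, attrs in incoming.items():
        cur = open_rows.get(key)
        if cur is not None and cur["attrs"] == attrs:
            continue                      # no change: keep the open row
        if cur is not None:
            cur["effective_to"] = today   # expire the superseded version
        out.append({"key": key, "attrs": attrs,
                    "effective_from": today, "effective_to": None})
    return out

history = [{"key": 1, "attrs": {"city": "Tracy"},
            "effective_from": date(2023, 1, 1), "effective_to": None}]
merged = scd2_merge(history, {1: {"city": "San Jose"}}, date(2023, 6, 1))
# merged now holds the expired old row plus an open row with the new city
```

In the real pipeline the same logic would be expressed as a MERGE over Delta or Snowflake tables rather than Python dicts.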
  • Amazon Web Services (AWS)
    Senior Solution Architect, AWS Data Lab
    Amazon Web Services (AWS) Oct 2021 - Mar 2023
    Seattle, WA, US
    I was a Senior Architect on the AWS Data Lab team. AWS Data Lab, part of the AWS development organization, designs and builds labs for AWS customers. I worked with customers to provide low-level reference architectures (Design Lab) and to build reference use cases for data products (Build Lab). On average an architect does one lab every month; each involves understanding the use cases, interacting with the enterprise's data leaders, and planning the lab. Data Lab used the following AWS services for building data products: Glue, Lake Formation, EMR, Redshift, QuickSight, SQS, SNS, Step Functions, Kafka, Kinesis, and Apache Spark. I did over 14 labs with customers, falling into the following domains:
    - Data strategy and modern data stack
    - Cloud migration from on-prem
    - SQL migration to data pipelines
    - Transactional warehouse
    - Building scalable real-time processing pipelines
    - Scalable warehouse
    Blog: https://aws.amazon.com/blogs/big-data/handle-upsert-data-operations-using-open-source-delta-lake-and-aws-glue/
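The linked blog post covers upsert handling with open-source Delta Lake and AWS Glue. The MERGE semantics it builds on can be sketched in plain Python (an illustration only, with hypothetical row shapes; the real workload uses Delta Lake's MERGE INTO on tables):

```python
# Upsert (MERGE) semantics sketch: rows that match the target on the key
# are updated; rows with no match are inserted. Plain-Python illustration.

def upsert(target, updates, key):
    index = {row[key]: i for i, row in enumerate(target)}
    for row in updates:
        if row[key] in index:
            target[index[row[key]]] = row   # WHEN MATCHED THEN UPDATE
        else:
            target.append(row)              # WHEN NOT MATCHED THEN INSERT
    return target

rows = [{"id": 1, "status": "old"}, {"id": 2, "status": "old"}]
result = upsert(rows, [{"id": 2, "status": "new"}, {"id": 3, "status": "new"}], "id")
# result: id 1 untouched, id 2 updated in place, id 3 appended
```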
  • Relevance Lab
    Enterprise Data Architect & Senior Director
    Relevance Lab Jan 2016 - Oct 2021
    San Jose, CA, US
    I did hands-on consulting for enterprises on data strategy, data quality, data ingestion, big data migration to cloud, master data management, and operationalizing ML models.
    Databricks badge: https://credentials.databricks.com/7f1ca732-aa34-478d-8841-5e66d1e7a96b
    Some of my engagements:
    - Data lake/lakehouse creation on-prem (HDFS) and on AWS/Azure (S3/Lake Formation) for a large bank and a media company (PB of data, security, access control)
    - Data ingestion and ML/DataOps for a large commercial bank in the USA (batch and streaming modes)
    - Migration of Python code to Scala API-based Spark
    - Fine-tuning of Spark code and spark-submit
    - Large-scale data quality and compliance implementation for a state government of India
    - Apache Spark-based data processing framework for a large pharma company in the USA
    - Migration of SQL to Spark SQL and core Spark
    - AWS Glue-based data processing framework for a large pharma company in the USA
    - Autonomous signal ingestion and transformation/attribution (Spark-based) for a luxury vehicle brand
    - Categorization of profiles and matching with target groups using NLP and machine learning
    - ML-based anomaly detection for credit card fraud
    Feel free to contact me [ vivek_ks @ hotmail[.]com ] if you need consulting or custom solutions for data products.
    Platform: Core Java, Spark, Databricks, microservices, AWS (S3, Glue, RDS, EMR, Redshift), osDQ, Hadoop, Hive
  • Comcast
    Consultant Principal Architect - Athena, Comcast Advanced Advertising Group
    Comcast Aug 2014 - Dec 2017
    Philadelphia, PA, US
    Processed 10 TB of data per day; designed everything from data ingestion to insight and everything in between.
    Data sources: viewership data, PII data, customer data, advertisement data logs, 3rd-party data
    Work: data lake creation, data ingestion, Apache Spark-based data processing, data quality, client solutions, Java, data products, information architecture, strategy for migration to big data architecture
    Data science (Spark MLlib): linear regression, multilinear regression, k-means, fuzzy matching (cosine distance), logistic regression, random forest, data pipelines, segmentation, micro-segmentation, recommendation, collaborative filtering
    Amazon Cloud: S3 push and pull, EMR processing
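One technique listed above, fuzzy matching via cosine distance, can be illustrated with a small sketch. The character-bigram representation here is an assumption for illustration; the actual work used Spark MLlib at scale.

```python
import math

# Fuzzy matching sketch: represent strings as character-bigram count
# vectors and compare them with cosine similarity (1 - cosine distance).

def bigrams(s):
    s = s.lower()
    counts = {}
    for a, b in zip(s, s[1:]):
        counts[a + b] = counts.get(a + b, 0) + 1
    return counts

def cosine_similarity(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

# Near-duplicate company names score high; unrelated strings score low.
sim = cosine_similarity(bigrams("Comcast Corp"), bigrams("Comcast Corporation"))
```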
  • Infosys
    Principal Product Architect - PRD Group
    Infosys Oct 2012 - Jul 2014
    Bangalore, Karnataka, IN
    1. Private PaaS evaluation and standardization of the big data stack
    2. Developed and productized algorithms for search and recommendation on big data using matrix factorization
    3. Defined a RESTful API framework for data quality components
    4. Mentored junior data architects and data scientists
    5. Did a PoC on image-based recommendation using OpenCV
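The matrix-factorization approach to recommendation can be sketched as a tiny SGD loop: learn user and item factor vectors whose dot product approximates observed ratings. This is a toy illustration under assumed hyperparameters, not the productized Infosys algorithm.

```python
import random

# Matrix factorization sketch: fit U (users x k) and V (items x k) by
# stochastic gradient descent on observed (user, item, rating) triples.

def factorize(ratings, n_users, n_items, k=2, steps=3000, lr=0.02, reg=0.02):
    random.seed(0)
    U = [[random.random() * 0.1 for _ in range(k)] for _ in range(n_users)]
    V = [[random.random() * 0.1 for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)  # gradient step on U
                V[i][f] += lr * (err * uf - reg * vf)  # gradient step on V
    return U, V

ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1)]
U, V = factorize(ratings, n_users=2, n_items=3)
pred = sum(U[0][f] * V[0][f] for f in range(2))  # reconstructs rating (0, 0)
```

Unobserved (user, item) cells of the reconstructed matrix serve as recommendation scores.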
  • osDQ-Based Data Quality
    Partner
    osDQ-Based Data Quality Apr 2012 - Sep 2012
    I struck out on my own in the big data analytics space, joining hands with Graymatter, where we set up a competency center for big data, created a data quality framework for them, and bid to clients with our solution.
  • Yahoo
    Principal Architect - Yahoo!
    Yahoo Aug 2009 - Apr 2012
    Sunnyvale, CA, US
    Data Architect for the Yahoo! S&E group. Yahoo!'s data centers collect a huge amount of system data, distinct from user data and user behavioral data, at around 15 TB/day. As architect my responsibilities were:
    - Design and implement an end-to-end BI solution (big data analytics) for system groups within Yahoo
    - Define high availability, security, and scalability of the system
    - Define the physical layout and topology of the system
    - Deploy MicroStrategy (reporting tool) into the production environment
    - Select the ETL tool (we evaluated IBM DataStage, Talend, and Informatica)
    - Define SLAs, the business continuity plan, and roles and responsibilities within the architecture
    - Define the data storage system for 500 TB of data (Oracle RAC 11g)
    - Define grid solutions (Hadoop, Pig, Hive, Oozie) for 2 PB of raw storage
    - Write data integration software for the Grid (Hadoop) and Yahoo internal products
    - Decide the security layout, bandwidth, and physical layout of architectural components
    - Present to executives for cost and hardware/software approval
    - Interact with Yahoo internal teams to integrate their products into the architecture
    - Implement the Microsoft Bing data integration and latency reporting system
    - Design the Grid reporting framework using a Jersey (SOA)-based framework to report directly from the Grid
  • Pearson
    Enterprise Data Architect
    Pearson Mar 2008 - May 2009
    London, GB
    Pearson is a world leader in student assessment, with scores of heterogeneous applications and data sources supporting the business. As enterprise data architect my responsibilities were:
    - Define the canonical data reference model used by SOA
    - Identify and define business entities and data flows
    - Define the persistence and data service layers
    - Define the data integration and transformation layer
    - Define best practices for data storage and data presentation (reporting)
    - Define interaction with business services and business entities
    - Define the data governance framework and data model group
    - Define the data usage model
    - Look into master data management
  • Razorsight
    Principal BI and Data Architect
    Razorsight Jul 2007 - Mar 2008
    Reston, Virginia, US
    Razorsight is a VC-funded company developing products for telecom spend management, automating invoice reading and building a framework to manage invoices, payment audit, and dispute resolution. My responsibility was the BI components. Major work I was involved in:
    - Developed an algorithm for parsing text-based invoices and loading them into the data model
    - Developed the data dictionary for the AnA data model
    - Developed code for a similar-supplier check (data quality) on top of Lucene (open source)
    - Provided a framework for BusinessObjects (reporting tool) security implementation and backup
    - Worked on the Razorsight eLoader Java project, which has five major parts: extraction, validation and transformation, loading, system integration, and monitoring and security
    - Worked on OLAP creation from the transactional database for future reporting
    - Worked on the data governance roadmap
    - Transitioned from transactional modeling to dimensional modeling
  • Siebel
    Project Manager
    Siebel Jan 2006 - Jun 2007
    Mountain View, CA, US
    I managed delivery of the HiTech, Consumer Goods, and Automobile verticals (part of the Manufacturing and Distribution, M&D, group) of Siebel, a world leader in CRM. I also spearheaded the Business Intelligence Special Interest Group (SIG) at Symphony Services. Major work apart from smooth delivery:
    - Productization of the Capco legacy system; my recommendations covered data model changes, row-level data security, and data loading
    - Worked closely with Siebel's Data Management Group on vertical requirements: relations between new tables and columns, conditional fetching of data, and data capacity planning
    - Worked as moderator for Fagan-style design reviews: acceptance of new features, low-level design review, and triage of bugs
  • Business Objects
    Senior Project Manager
    Business Objects Mar 2003 - Jan 2006
    New York, NY, US
    Business Objects is a leader in the business intelligence domain, with product lines including BusinessObjects Reports XI, Data Integrator, Application Foundation, and Set Analyzer. My project was accountable for Application Foundation (EPM), Set Analyzer, and Data Integrator on Unix platforms (AIX, Solaris, HP) and for certification on different configurations of web application servers and databases. We were also responsible for resolving bugs and configuration issues for clients. My responsibilities:
    - Delivery of UNIX ports (Sun, AIX, and Linux) for the Application Foundation product line
    - Delivery of data connectivity for OS/390 and Sybase (a customer-funded project)
    - Delivery of Data Integrator (ETL) versions 6.5.1, 11.0.1.1, and 11.0.2.5
    - Owning bug delivery and environment issues for presales
    - Migration from MicroStrategy 7i to Business Objects 6.5.1 for Littlewoods, UK
    - Setting up the India team for the Professional Services Offering (PSO) for Business Objects
    - Consulting with the onsite team to set up the Wyeth (USA) BI competency center
  • Oracle
    Senior Member Of Technical Staff
    Oracle Jul 2000 - Sep 2002
    Austin, Texas, US
    I was primarily responsible for the IBM AIX releases, integration of Discoverer with iAS, and bug fixing of Discoverer Server. I was also involved in project management and quality processes of the Discoverer team, apart from product tuning and performance enhancement exercises. From April 2002 I was given the added responsibility of bug coordinator: doing the initial screening of each bug, assigning it to the appropriate person, and helping them resolve it.
    Environment: C, C++, Java, Unix internals, ClearCase, Perl
  • Dow Jones
    C/C++ Unix Developer
    Dow Jones Apr 1998 - Jun 2000
    New York, NY, US
    Involved in the re-architecture of the Dow Jones Information Distribution System (IDS). IDS uses newswire technology to distribute information and products to domestic and international clients. Each client is attached to either a TCP channel or an X.25 line used for transmission and retransmission of the required data; an administrator sets transmission rules for each client, and a client can request retransmission of previous data.
    My role: design and implementation of a multi-threaded, thread-safe, multi-user administrative interface for IDS.
  • Infosys
    Software Engineer
    Infosys Jun 1996 - Feb 1998
    Bangalore, Karnataka, IN
    Trinity - ADEPT (Analysis, Design, Extraction and Performance Tool): Trinity was developing ADEPT for Mobile Telephone Exchange (MTX) switches, used for switch analysis, dimensioning, forecasting, etc.
    Environment: X/Motif, UIMX, C++

Vivek Kumar Singh Education Details

  • Indian Institute Of Technology, Kanpur
    Engineering
  • Netarhat School, Bihar-Jharkhand
    Matriculation
  • Science College Patna

