Yuping Yang

Data Scientist @ C3 AI
Yuping Yang's Location
Dublin, Ohio, United States
About Yuping Yang

Yuping Yang is a Data Scientist at C3 AI.

Yuping Yang's Current Company Details
C3 AI

Data Scientist
Yuping Yang Work Experience Details
  • Self-Employed
    Data Scientist
    Self-Employed May 2024 - Present
    Programmed time series forecasting software (in progress). Components completed so far:
    • Used Python mathematical libraries: NumPy, pandas, Matplotlib, pickle, h5py; clustering with sklearn.cluster (KMeans, DBSCAN, AgglomerativeClustering) and tslearn.clustering (TimeSeriesKMeans); distance measures with dtaidistance (dtw), fastdtw, and tslearn.metrics (cdist_dtw); evaluation with sklearn.metrics
    • Used the silhouette scoring technique to evaluate clustering
    • Used Optuna's Tree-structured Parzen Estimator (TPE) sampler to search a parameter space of more than 0.5 million configurations
    • Used an NVIDIA GPU with Ubuntu on WSL on Windows 11; installed CUDA and conda
    • Coded threaded asynchronous processing within programs and data streaming between different programs
    • Coded probabilistic scoring of prediction accuracy
    • Coded real-time streaming of online IEX data
    • Set up a paper trading mechanism
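The silhouette-based cluster evaluation mentioned above can be sketched as follows. This is a minimal illustration with synthetic data and plain Euclidean KMeans; the entry's actual work used DTW distances via tslearn and dtaidistance, and the data here is invented.

```python
# Sketch: choosing the number of clusters by silhouette score
# (synthetic data; the real project clustered time series with DTW).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two well-separated synthetic groups of flattened series windows
X = np.vstack([rng.normal(0, 1, (30, 10)), rng.normal(5, 1, (30, 10))])

scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(best_k)  # the two separated groups give best_k == 2
```

The same loop applies with a precomputed DTW distance matrix by passing `metric="precomputed"` to `silhouette_score`.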
  • C3 AI
    Senior Data Scientist
    C3 AI Jan 2024 - Present
    Redwood City, California, United States
    • System setup and upgrading of the C3 AI platform
    • Used VMs, WSL2, Linux in WSL2, Docker, macOS, GitHub, Python, Node.js, and the C3 AI cloud
    • Participated in client engagements for AI projects
  • Zerosight
    Ml Architect
    Zerosight Feb 2023 - Dec 2023
    San Jose, California, United States
    Because of the stealth nature of the project, I do not disclose its goal and scope. I functioned as a foundation engineer to probe the design and technical viability, and to construct a POC of a complete AI-based software product. Some of the technical skills used in this project:
    • Investigated various installations of OpenPose, MediaPipe, YOLOv8, and YOLO-NAS under a variety of environments: Linux, Windows, Docker images and containers, Dockerfile, pre-configured images, CMake, Python virtual environments, NVIDIA GPU, CUDA, Anaconda, VLC, FFmpeg, Streamlit
    • Used vision ML packages (YOLOv8, YOLO-NAS, OpenPose, MediaPipe, PyTorch) for object detection and pose estimation
    • Used the vision package OpenCV
    • Used pretrained models and did some model training; used the annotation software CVAT to prepare labels for training
    • Performed affine transformations for 3D modeling: translation, rotation, quaternions, 3D transformation matrices, etc.
    • Used Blender to craft 3D objects and Blender scripting (Python) for object movement control
    • Used Unity and Unity scripting (C#)
    • Did socket programming for messaging over both UDP and TCP sockets, between (1) Python and Python (in Blender) and (2) Python and C# (in Unity)
    • Wrote threaded and async code (C#) to decouple messaging from object control
    • Investigated data storage for storing and searching large amounts of image data
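The Python-to-Python UDP messaging described in the entry above can be sketched as below. The message format and object names are illustrative assumptions; in the real project one endpoint lived inside Blender (and a C# variant inside Unity).

```python
# Sketch: UDP messaging between two Python endpoints on localhost,
# e.g. a pose-estimation process publishing positions to Blender.
import json
import socket

# Receiver (a Blender-side script would poll a socket like this)
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))          # let the OS pick a free port
port = rx.getsockname()[1]

# Sender (hypothetical message: move object "cube" to a 3D position)
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(json.dumps({"obj": "cube", "pos": [1.0, 2.0, 0.5]}).encode(),
          ("127.0.0.1", port))

data, _ = rx.recvfrom(4096)
msg = json.loads(data)
tx.close()
rx.close()
print(msg["obj"], msg["pos"])
```

UDP is a natural fit here: position updates are frequent and an occasional dropped datagram is harmless, whereas TCP would add ordering and retransmission latency.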
  • Meta
    Data Scientist
    Meta Dec 2021 - Jan 2023
    Menlo Park, California, United States
    1. Performed analysis to optimize marketing campaign parameters (Presto SQL/Hive/notebooks). Details: initially analyzed the effect of Bid Cap (one of the ad parameters) through database investigation. During the investigation I found that many other parameters affect campaign outcomes, and that the problem can be formulated as high-dimensional optimization. Used Python plotting libraries, Presto SQL, and Hive.
    2. Analyzed marketing results and wrote reports of findings (Presto SQL/Hive/statistics/notebooks). Details: marketing campaigns are conducted as A/B tests, with results recorded in a database. Needed to find lift, reach, impressions, CTR, and OR across channels, objectives, verticals, partner conditions, etc. Any significant discovery needed to be backed by statistical significance. The analysis used z-tests, t-tests, p-values, winsorization, and CUPED, programmed in Python/R/Presto SQL on a Jupyter-style platform, for marketing managers.
    3. Machine learning: defined audiences for marketing campaigns (ML/AI, using PyTorch/Hive/notebooks). Details: based on a marketing manager's campaign design, used ML (PyTorch) to identify target customers suitable for inclusion in the campaign. This involved label (target measurement of optimization) design, data preparation, model design, and training. The trained models were used to filter the customers most responsive to the campaign.
    4. Wrote data pipelines to support marketing campaigns (Presto SQL/Hive/Python/pipelines). Details: a pipeline (at Meta) is a Python program that combines multiple SQL statements in a logical way; each statement is in Presto (a big data query engine) SQL and runs on the Hive data store. The work involved running Presto SQL queries on Hive (the queries had to accomplish what was specified in the marketing managers' campaign design documents), combining the SQL statements into a data pipeline, and scheduling the pipeline runs.
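CUPED, one of the A/B-test techniques named in item 2 above, reduces metric variance by subtracting the part of the experiment-period metric explained by a pre-experiment covariate. A minimal sketch with synthetic data (the coefficient is theta = cov(pre, post) / var(pre)):

```python
# Sketch of CUPED variance reduction (synthetic data, not Meta's).
import numpy as np

rng = np.random.default_rng(42)
pre = rng.normal(10, 2, 1000)          # pre-experiment covariate
post = pre + rng.normal(0, 1, 1000)    # experiment-period metric

theta = np.cov(pre, post)[0, 1] / np.var(pre)
adjusted = post - theta * (pre - pre.mean())

# The adjusted metric keeps the same mean but has lower variance,
# so the same lift reaches significance with fewer samples.
print(np.var(adjusted) < np.var(post))  # True
```

Centering the covariate (`pre - pre.mean()`) is what keeps the adjusted metric unbiased for the original mean.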
  • Comresource
    Senior Technical Staff
    Comresource Jun 2019 - May 2021
    Columbus, Ohio
    Worked as a senior technical member on the data solutions team, with diverse duties and tasks:
    • Data mining (text data pattern matching and Apriori): independently built a program that found more than 1,000 useful patterns with over 99% accuracy in a database. For comparison, two analysts had previously worked over 6 months and found 700 patterns in similar data.
    • Text data classification: independently built a program to classify product tiers in a database of product info. The program replaced previous manual work; a single run is equivalent to two people performing 12 months of manual work.
    • Text parsing: independently built a program to parse US Patent Office PDF documents (>50K docs), performing OCR and extracting relevant patent citation information with 95-97% accuracy. These documents were prepared individually and manually by USPTO patent examiners; OCR is the first step in the data processing pipeline.
    • Data search tool: independently built an MS SQL database search tool with a Windows GUI (PyQt, Python). It can be used to visually search and display aspects of a relational database, such as table relationships, relevant keywords, and data contents.
    • Graph data visualization: built a visual interface to display lines and nodes representing graph data in a SQL Graph database.
    The coding work was Python with PyQt (GUI), D3.js (web node graphs), Tesseract (OCR), re, NLP, Apriori (data mining), and relational and graph data modeling.
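The Apriori-style pattern mining in the first bullet above can be sketched in a few lines of pure Python. The transactions here are toy data; the real work mined a text database, and candidate generation below is the simplest (union-of-pairs) variant rather than a fully pruned implementation.

```python
# Sketch of Apriori frequent-itemset counting (toy transactions).
from itertools import combinations

transactions = [
    {"bolt", "nut", "washer"},
    {"bolt", "nut"},
    {"bolt", "washer"},
    {"nut", "washer"},
]

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for all itemsets meeting min_support."""
    items = {i for t in transactions for i in t}
    frequent = {}
    k, candidates = 1, [frozenset([i]) for i in items]
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Apriori principle: only unions of frequent k-itemsets can be
        # frequent (k+1)-itemsets, so the search space shrinks each level.
        candidates = list({a | b for a, b in combinations(level, 2)
                           if len(a | b) == k + 1})
        k += 1
    return frequent

result = frequent_itemsets(transactions, min_support=2)
print(result[frozenset({"bolt", "nut"})])  # 2
```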
  • Code Track
    Founder
    Code Track Jun 2016 - Mar 2019
    Columbus, Ohio, United States
    A personal effort to get funding for software I created.
    (1) Coded a data lineage search tool for use with MSSQL, completed to MVP (similar to www.getmanta.com); 15K lines of Python.
    (2) Coded a socket-based office document sharing tool, completed to MVP; 5K lines of Python. Both programs used a PyQt GUI with PostgreSQL as the backend.
    (3) Created a Django-based web site.
    (4) Tried TensorFlow, in an effort to incorporate its capabilities into the final products.
    (5) Solicited funding, wrote business plans, and made presentations to potential investors.
  • Signet Accel, Inc.
    Python Developer
    Signet Accel, Inc. Apr 2015 - Mar 2016
    Columbus, Ohio
    Wrote a versatile data converter in Python for loading medical data from diverse sources across various hospital systems into the universal OMOP schema. The data sources contain medical codings from SNOMED, CPT-4, and ICD-9. The program is based on Python, with csvkit and the Lucene search engine.
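The core of such a converter is mapping source vocabulary codes to OMOP concept IDs. A minimal sketch with the standard-library csv module (the mapping table and field names here are hypothetical; real OMOP vocabularies are far larger and the entry's converter also used csvkit and Lucene for matching):

```python
# Sketch: mapping (vocabulary, source code) pairs to OMOP concept IDs.
import csv
import io

# Hypothetical mapping table; 201826 is used here as an example concept ID.
concept_map = {
    ("ICD9", "250.00"): 201826,
    ("SNOMED", "44054006"): 201826,
}

source_rows = io.StringIO(
    "patient_id,vocabulary,code\n"
    "1,ICD9,250.00\n"
    "2,SNOMED,44054006\n"
)
converted = [
    {"patient_id": r["patient_id"],
     "concept_id": concept_map.get((r["vocabulary"], r["code"]))}
    for r in csv.DictReader(source_rows)
]
print(converted[0]["concept_id"])  # 201826
```

Using `.get()` leaves unmapped codes as `None`, which a real loader would route to a review queue rather than drop silently.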
  • J.P. Morgan
    Software Developer
    J.P. Morgan Apr 2013 - Mar 2015
    Columbus, Ohio
    (1) Debugged and modified software for monitoring financial transactions.
    (2) Investigated the feasibility of data-level linking of a DB2-based system to an Oracle-based system.
    (3) Wrote a complex Java program to convert data formats.
    (4) Reverse engineered the logic of an old system in preparation for building a new one.
    (5) Studied the Actimize rule engine and built a POC in preparation for building a new system.
    (6) Investigated historical trading records in support of a well-publicized financial case.
    (7) Performed website security related work.
  • Research Institute At Nationwide Childrens Hospital
    Data Architect
    Research Institute At Nationwide Childrens Hospital Nov 2011 - Mar 2013
    Columbus, Ohio
    Re-architected a funding-tracking data warehouse for full life cycle tracking of research applications and projects. The work involved investigating the old system, current projects, and usage information, as well as data modeling, ETL, etc.
  • G2O
    Senior System Analyst
    G2O Jan 2006 - Nov 2011
    Columbus, Ohio Metropolitan Area
    Miscellaneous consulting projects, most of them ETL, data warehouse, or BI related. Roles in each project involved client project proposals, scoping, budgeting, and implementation (70%). Selected projects:
    • Etech, an Ohio government agency: reorganization of state-owned building and teacher data.
    • 31gifts Co.: SQL Server replication and ETL work using SSIS.
    • Ohio Department of Development: ETL using T-SQL and SSIS.
    • Cardinal Health: scanned code patterns in a large repository of deployed programs using Perl.
    • Ohio Department of Insurance: built BI platforms and two data marts using SSIS, SSRS, and SSAS.
    • Ohio Public Employees Retirement System: built a data mart POC with a star schema; 2-month project.
    • Accent Energy Co.: redesign and reorganization of databases; ETL work using C#, T-SQL, and SSIS.
    • Commercial Vehicle Group, Inc.: ETL work for an inventory control system, feeding data from sub-companies around the world into a central SQL Server using T-SQL.
    • Office of Student Affairs of Ohio State University: sensitive data protection from web access.
    • Ohio Rehabilitation Services Commission: built a data mart POC using a star schema, C, and Unix shell scripts.
    • Wendy's International: ETL tool evaluation for the client, comparing DataStage, Informatica PowerCenter, Microsoft SSIS, Sunopsis, Ab Initio, iWay, and DataMigrator by building test cases.
  • Ge Aircraft Engine
    Data Architect
    Ge Aircraft Engine Jan 2005 - Mar 2006
    Cincinnati, Ohio, United States
    • Worked to support the integration of several very large databases during the merger of GE Aircraft Engines and GE Locomotives
    • Conducted extensive data analysis and redesigned the data model of the databases and data warehouses of GE Transportation (parent company of GE Aircraft Engine and GE Locomotive) into six subject areas: Human Resources, Financial, Contracts, Products, Engine, and Parts. The work involved data modeling of hundreds of tables in over thirty interconnected databases.
  • Ruiling Optics Llc
    Founder
    Ruiling Optics Llc Jan 2003 - Dec 2004
    Columbus Metropolitan Area, Ohio, United States
    • Proposed an ultra-fast flatbed scanning device, applied for and obtained patents, conducted research on imaging software and hardware, and carried out many fundraising activities
    • Created a high-speed, large-area flatbed scanner using a matrix of optical sensors for split-second document scanning with visual correction. Researched numerous patents, optical imaging devices, and embedded programming. Applied for and obtained three US utility patents.
  • Profitlogic
    Principal Engineer
    Profitlogic Oct 2001 - Oct 2002
    Cambridge, Massachusetts, United States
    • Profitlogic was a software startup aimed at forecasting prices for retailers. The JC Penney project ($15M) was the last and biggest project done by the company before it was sold to Oracle; I joined at the start of this project.
    • Designed a large database (a few trillion bytes at the time; in today's terminology it is closer to a data lake, with core parts having relational as well as star schemas), leading a team of 7 developers to build the database, which is part of the pricing computing engine of the $15M project.
    • Led a team of seven engineers in building a large data warehouse for JC Penney's pricing optimization system, handling 3.5 million active SKUs across 1,020 stores. The project was implemented on an Oracle database, with an initial data load of around one TB. Utilized UNIX shell scripting, Toad, Oracle 8i, PL/SQL, Pro*C, and Erwin.
  • Electron Economy
    Lead Engineer
    Electron Economy May 2000 - Oct 2001
    Cupertino, California, United States
    • Electron Economy was a software startup with $86M in venture funding. I was hired without a role assignment; my role was determined two months after I joined, when I proposed an automatic, modularized, online, rule-based intelligent business transaction (negotiation) system. Based on my proposal, a team of 15 engineers was formed, with me as the lead engineer. All team members held PhDs or master's degrees and were developers with various skills. I helped form the team and structure the project. When the company was sold to Viewlocity, the rule-based transaction engine was one of the two main software assets sold along with the company.
    • Conducted supply chain visibility analysis and proposed a real-time online business transaction project, leading to the establishment of the Intelligence Commerce Engineering (ICE) division (members: 10 PhDs, 5 MSs).
    • Collaborated on Java software design and implementation.
    • Created a reporting data mart for the central database, utilizing ERwin, Pro*C, C, Java, and Oracle.

Yuping Yang Education Details

Frequently Asked Questions about Yuping Yang

What company does Yuping Yang work for?

Yuping Yang works for C3 AI.

What is Yuping Yang's role at the current company?

Yuping Yang's current role is Data Scientist.

What schools did Yuping Yang attend?

Yuping Yang attended The Ohio State University, The Ohio State University, The Ohio State University, and Fudan University (复旦大学).
