Kai Yang

NLP Engineer
Kai Yang's Location
Shanghai, China
About Kai Yang

I am Kai, a data scientist based in China. My research focuses primarily on recommendation systems and large language models (LLMs). My technical stack includes LLMs such as Llama, Alpaca, ChatGLM, WizardLM, and Mixtral, along with related techniques and tools such as LoRA, QLoRA, DeepSpeed, TGI, vLLM, GGML, RoPE, ALiBi, P-tuning, adapter tuning, AdapterDrop, and CoT. I have several years of work experience at major Chinese internet companies, on projects spanning LLM platforms, dialogue systems, risk control systems, recommendation systems, and more.

Kai Yang's Current Company Details

NLP Engineer
Kai Yang Work Experience Details
  • Ubix
    NLP Engineer
    Ubix Apr 2023 - Mar 2024
    Shanghai
    SQL Auto-Generation Large Model Platform: Responsible for training the SQL large model and for prompt engineering based on the DIN-SQL framework.
    1. Data preparation: combined self-constructed and manually annotated data with open-source data (Spider). Each sample has the format (Question, Schema_links, Step_by_Step, final SQL). Route prompts cover general questions, easy prompts (simple SQL without joins), mid prompts (two-table joins), and hard prompts (multi-table joins).
    2. Model training: LoRA fine-tuning of CodeLlama-34B, covering route classification, SQL generation, and general text generation.
    3. Model deployment: served with the TGI and vLLM frameworks. The pipeline routes each user question (non-SQL/easy/mid/hard), generates schema_links (the fields and join relationships used in the SQL), and then incorporates those results into the prompt that produces the final SQL.
  • Tongcheng Travel Holdings Limited
    Recommendation Algorithm Engineer
    Tongcheng Travel Holdings Limited Jul 2020 - Nov 2022
    Shanghai
    HuiXing cross-recommendation system: Optimized revenue and user conversion on transit routes. Used a combination of XGB and deep learning to address users' travel difficulties during peak periods and the low success rate of ticket grabbing, recommending combined routes such as train + train and train + plane to users under different search categories, which increased click-through and order rates.
    Artificial intelligence customer service: Performed word segmentation and sentiment analysis on basic features in historical data to identify and understand user intent. For single-round QA and multiple tasks, built an OTA industry knowledge and label system and trained FastText and BERT offline to an accuracy above 90%. Ran query retrieval and recall for different user intents, then re-ranked candidates by intent probability with fusion rules to improve matching for cold-start and high-frequency questions.
    UGC community building: Built the UGC content platform of the Tongcheng Mini Program from scratch. Starting from tag annotation and user travel records, identified high-quality travel posts in the early stage and distributed them to user groups by city and age group, used MAB to dynamically estimate user interests and recall the most suitable content, and trained a multi-objective DNN model on the metrics of each stage for content distribution and ranking.
  • Oyo Hotel (Masayoshi Son Invest)
    Senior Data Engineer
    Oyo Hotel (Masayoshi Son Invest) Apr 2019 - Mar 2020
    Shanghai, China
    Added features for user IMEI identification, verification-code page operations and click behavior, device model matching, IP fluctuation positioning, GPS aggregation of fraudulent accounts, and startup and running time. Added an unsupervised model on top of LightGBM, filtering features with variable clustering and using an Isolation Forest model to identify results, which intercepted 20-40% of fraudulent registrations every day.
    Built a fraudulent coupon transaction model and a dynamic coupon issuance model, with separate models for new and regular customers. The regular-customer model mainly predicts whether a customer will order again, which determines the coupon amount for the next event and whether a threshold applies. The bottom layer of both models uses LSTM and XGB to output normalized scores, and the upper layer uses LR to improve interpretability. After launch, the order conversion rate increased by 10%, and customers who were issued coupons showed a high order rate.
    Using a ResNet model and only one camera per hotel, counted in real time the people handing ID cards back and forth at the hotel front desk each day, compared that count with the number of orders entered by hotel owners on the PMS platform to identify false hotel transactions more accurately, and built reports to visualize and analyze abnormal fluctuations in hotel OCC indicators.
  • Kuainiu Group (Jing Dong Invest)
    Head Of Data Analysis
    Kuainiu Group (Jing Dong Invest) Apr 2018 - Mar 2019
    Shanghai, China
    Recommendation ranking: Using users' app installation lists, the demand preferences of different loan products, and historical click habits, applied clustering and classification algorithms to build user portraits, improving product purchase conversion by 20%.
    Social network: For the first time at the company, used app tracking points and user operator data to code users as 0 or 1 through the data funnel method, then added the A-card score. Each user's social score is updated according to the weights of the social network. This formed a user social risk system, with the B card and C card added as features (IV of about 0.1).
    Warehouse building: Responsible for the design, construction, optimization, and integration of the risk control data warehouse, covering demographic characteristics, behavior data, and loan data across 10 tables with more than 2,000 fields, which greatly improved modeling efficiency.
    Anti-fraud model: Responsible for the anti-fraud project of China Everbright Bank, creating a total of 15 wide Hive views. Through preliminary analysis and cross-validation, designed 10+ multi-dimensional feature subjects, created 5,000+ derived variables through FRM clustering, established the y label with a semi-supervised approach, and classified text-type features.
  • Nami Finance
    Data Analyst
    Nami Finance Apr 2017 - Apr 2018
    Shanghai, China
    Built logistic regression models in Python, covering SQL data extraction, variable ETL, and variable screening. Established 5 model versions in total, including pre-loan and post-loan scorecards, with a repayment rate above 91%.
    Mined nearly 20 strong, valid tags from third-party data sources, user text messages, and bank transaction records and added them to the pre-loan scorecard model, raising its overall AUC by 1% and its KS value by 2%.
  • Jiang Taigong Sunshine Private Fund Co., Ltd
    Trader
    Jiang Taigong Sunshine Private Fund Co., Ltd Aug 2015 - Jan 2017
    Shanghai, China
    Traded stocks on the secondary market.
  • Standard Chartered Bank
    Intern
    Standard Chartered Bank Jul 2014 - Aug 2014
    Yuan Qu Branch
  • Bank Of China
    Intern
    Bank Of China Jul 2013 - Sep 2013
    Xiang Zhang Branch
    Focused on basic banking business: customer guidance, simple management tasks, and sales promotion of financial products.
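
The route categories described in the Ubix entry (general, easy, mid, hard) can be illustrated with a minimal Python sketch that labels a training sample by counting JOIN keywords in its final SQL. Everything here is an assumption made for illustration, not the platform's actual code: the names `Sample`, `route_label`, and `build_prompt`, the prompt template, and the mapping of one JOIN to "mid" and two or more to "hard". The entry describes routing at serving time as a task of the fine-tuned model; this heuristic only shows how such labels could be derived for annotated data.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """One training record: (Question, Schema_links, Step_by_Step, final SQL)."""
    question: str
    schema_links: List[str]
    step_by_step: str
    final_sql: Optional[str]  # None marks a general (non-SQL) question


# Matches the JOIN keyword as a whole word, case-insensitively.
# Limitation: implicit comma joins are not counted, and the word
# "join" inside a string literal would be.
_JOIN = re.compile(r"\bjoin\b", re.IGNORECASE)


def route_label(sample: Sample) -> str:
    """Assumed mapping: no SQL -> general, 0 joins -> easy, 1 -> mid, 2+ -> hard."""
    if sample.final_sql is None:
        return "general"
    joins = len(_JOIN.findall(sample.final_sql))
    if joins == 0:
        return "easy"
    if joins == 1:
        return "mid"
    return "hard"


def build_prompt(question: str, schema_links: List[str], route: str) -> str:
    """Hypothetical template that folds the route and schema links into the prompt."""
    links = ", ".join(schema_links)
    return f"[{route}] Schema links: {links}\nQuestion: {question}\nSQL:"


if __name__ == "__main__":
    sample = Sample(
        question="List each order with its customer name.",
        schema_links=["orders.customer_id = customers.id", "customers.name"],
        step_by_step="Join orders to customers, then select the name.",
        final_sql="SELECT o.id, c.name FROM orders o JOIN customers c ON o.customer_id = c.id",
    )
    route = route_label(sample)
    print(build_prompt(sample.question, sample.schema_links, route))
```

In this sketch the route label selects which prompt family a sample belongs to, and the schema links are placed in the prompt ahead of the question, mirroring the order of steps in the deployment description.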

Kai Yang Skills

Microsoft Office, Asset Management, Stock Trading, Future Leader In Standard Chartered, PHP, Unysjy

Kai Yang Education Details

Frequently Asked Questions about Kai Yang

What is Kai Yang's role at the current company?

Kai Yang's current role is NLP Engineer.

What schools did Kai Yang attend?

Kai Yang attended East China University of Science and Technology and Xi'an Jiaotong-Liverpool University.

What skills is Kai Yang known for?

Kai Yang has skills such as Microsoft Office, Asset Management, Stock Trading, Future Leader In Standard Chartered, PHP, and Unysjy.
