Jack Chih-Hsu Lin, Phd Email & Phone Number
Who is Jack Chih-Hsu Lin, Phd? Overview
Jack Chih-Hsu Lin, Phd is listed as a Software Engineer at Google, a company with 479 employees, based in the San Francisco Bay Area, United States. AeroLeads shows a matched LinkedIn profile for Jack Chih-Hsu Lin, Phd.
Jack Chih-Hsu Lin, Phd previously worked as Technical Writer at Towards Data Science and Lead Data Scientist, Generative AI at C3.ai. Jack Chih-Hsu Lin, Phd holds a Doctor of Philosophy (PhD) in Quantitative Computational Biology (GPA 4.0) from Baylor College of Medicine.
About Jack Chih-Hsu Lin, Phd
I am a lead data scientist at C3.ai. I have hands-on experience in ML (7 years) and DL (5 years) using Python (7 years). I build end-to-end DS/ML products for internal projects and external customers. I define problems and metrics; I devise/prototype solutions and productionize code. In a senior role, I lead junior data scientists and communicate with software engineers, project managers, product managers, and UI designers. I also write weekly reports to executives and explain DS approaches to customers and stakeholders. From a PhD program to a tech company, I have developed predictive models for various types of data, including tabular data, images, text, and graphs, and have ranked in the top 3.0% (26/866), 4.9% (188/3,835), and 6.1% (201/3,274) in 3 worldwide Kaggle machine learning competitions (overall rank: top 3.4%, 6,571/194,359). I have also invented, implemented, and published a new, interpretable neural network algorithm that converges 35% faster, uses 200 times fewer parameters, and performs similarly (AUROC>0.88) to a traditional neural network.
With skills in data science, machine learning, and deep learning, I cannot wait to solve all these interesting real-world problems.
• Languages: Proficient in Python (7 yrs), familiar with MySQL, Shell
• Statistical analysis and hypothesis testing (NumPy, SciPy)
• NLP: NER (Stanford NER, Hugging Face Transformer BERT, spaCy), fuzzy matching, TF-IDF, Word2vec
• ML (7 yrs): regression, classification, clustering, random forest, Scikit-learn, gradient boosting (XGBoost, LightGBM)
• DL (5 yrs): ANN, interpretable neural networks, image classification, object detection (Faster R-CNN), CNN (EfficientNet, EfficientNetV2), NLP (Siamese BERT), PyTorch, Keras
• Graph machine learning: node classification, link prediction, random walk, PageRank, DeepWalk, node2vec, graph neural network (GraphSAGE, Graph Attention Network, R-GCN), Deep Graph Library (DGL)
• Distributed computing (batch jobs; MapReduce), Docker, knowledge of Cassandra & Postgres, Pip/Conda, Git
Kaggle: https://www.kaggle.com/lin4mation
GitHub: https://github.com/Jack-Lin-DS-AI
Medium: https://medium.com/@jacklindsai
• Optimize PyTorch Performance for Speed and Memory Efficiency (2022) (10k+ views in a week): https://towardsdatascience.com/optimize-pytorch-performance-for-speed-and-memory-efficiency-2022-84f453916ea6
• Self-Supervised Learning (SSL) Overview
Jack Chih-Hsu Lin, Phd work experience
Technical Writer
Current
- Towards Data Science is one of the largest data science publications (650K followers).
- Mastering GenAI ML System Design Interview: Principles & Solution Outline (2024): https://towardsdatascience.com/mastering-genai-ml-system-design-interview-principles-solution-outline-71a4664511a7
- Mastering GenAI ML System Design Interview (2): Design ChatGPT Memory Feature (2024): https://medium.com/towards-data-science/mastering-genai-ml-system-design-interview-2-design-chatgpt-memory-feature-fe908517d76c
- Scaling Monosemanticity: Anthropic’s One Step Towards Interpretable & Manipulable LLMs (2024): https://medium.com/towards-data-science/scaling-monosemanticity-anthropics-one-step-towards-interpretable-manipulable-llms-4b.
- Optimize PyTorch Performance for Speed and Memory Efficiency (2022) (10k+ views in a week): https://towardsdatascience.com/optimize-pytorch-performance-for-speed-and-memory-efficiency-2022-84f453916ea6
- Self-Supervised Learning (SSL) Overview: https://towardsdatascience.com/self-supervised-learning-ssl-overview-8a7f24740e40
Lead Data Scientist, Generative Ai
Current
- Develop/deploy agentic RAG framework (reflection, memory) for querying structured data via natural language
- Pre-train and fine-tune Code Llama, Llama 3, StarCoder2 with LoRA on multi-GPUs for text-to-database queries
- Develop/deploy agentic automatic synthetic data generation pipeline, boosting model performance by 15%
- Align the LLMs with human preferences using chain of hindsight to reduce hallucination
Senior Data Scientist, Applied Machine Learning
- Write weekly reports to executives, work with product managers, designers, and engineers, and explain the DS approach to customers/stakeholders for the project
- Developed multi-task neural networks for hierarchical forecasting of business deals and company revenue
- Developed large-scale deep learning model trained with a graph of 3M nodes and 33M edges
- Built/deployed the algorithm pipeline (reducing cost by 500,000x) on 100 workers to determine spatiotemporal correlation among 9.8B time series geospatial data points and generated a heterogeneous knowledge graph
- Improved the recall of diagram parsing (computer vision, object detection) by 20-80% using Faster R-CNN in Keras
- Analyzed time series geospatial data of 800M records by distributed computation (batch jobs and MapReduce)
Postdoctoral Associate - Data Scientist
I invent and implement a new type of neural network in PyTorch that uses 200 times fewer parameters and converges 35% faster. I achieve an AUROC of 0.88 when predicting therapeutic target genes by modeling high-dimensional (~10,000) data in human cancer cell lines. I analyze and validate the predictions with statistical tests.
Graduate Research Assistant
I merged and cleaned data from 3 databases and generated a network of 215,000+ drug-gene-disease associations. I implemented and validated graph-based diffusion in Python to predict associations, which were confirmed with >90% precision either in later database releases or in the literature. The results were published as a 1st-author paper in.
President / Board Member
I led a team of 30+ people from 9 institutes/colleges in 4 Texas cities to facilitate intellectual conversation and networking among Taiwanese professionals in the bioscience field worldwide. I organized the annual symposium in 2019, and the number of participants increased by 27%.
Invited Speaker
2019/10/02 Gave a talk titled "One of the Best Ways to Explore Your Career: the Internship, a Thrilling Journey to Illumina iAspire" at Texas Taiwanese Biotechnology Association Webinar, Houston, Texas. 2019/08/04 Gave a talk titled "Texas: the Third Coast or the Third World of Biotech?" at Boston Taiwanese Biotechnology Association 2019 Annual Symposium.
Competition Participant
I am ranked in the top 3.0% overall (5,484/179,945). I have placed in the top 3.0% (26/866), top 4.9% (188/3,835), and top 6.1% (201/3,274) in 3 competitions (on image and tabular data) using data science, machine learning, and deep learning skills: data cleaning, missing data imputation, exploratory data analysis, statistical analysis, feature selection/engineering.
Bioinformatics Intern, Clinical Genomics Research Dept.
I analyzed clinical sequencing data to diagnose patients with rare and undiagnosed genetic diseases. I reduced the time colleagues spent on manual annotation by 94% by developing a new machine learning pipeline. I found unexpected, creative, and meaningful features to improve predictions. I collaborated with multiple teams to customize the pipeline to.
Team Lead, Business Case Challenge
I organized meetings and led an 11-intern cross-functional team to win 1st place out of 10 teams in the business case competition. We developed strategies and solutions for a challenge Illumina was facing by using everyone's expertise. The team consisted of people from diverse departments, e.g., marketing, financing, engineering, and computational biology.
Research Assistant In Institute Of Statistical Science
I processed hundreds of gigabytes of 5 types of cancer genomics data using Perl and R.
Research Assistant In Biodiversity Research Center
I analyzed RNA-seq data of 70 million paired-end reads from 12 yeast samples using Perl and R.
Colleagues at Google
- Dan Vo, Binh Dinh, Vietnam
- Stephen Lanier, Seattle, Washington, United States
- Empl Month, Wappingers Falls, New York, United States
- Gianni Dinovi, Amsterdam, North Holland, Netherlands
- Luiz Scheuer, São Paulo, Brazil
- Tokala Venkatesh, Hyderabad, Telangana, India
- Balakrishna Kumar V, Coimbatore, Tamil Nadu, India
- Hahkihns Uajah, Bangladesh
- Valerie Carey, Rochester, New York, United States
- Ryan Pégoud, London, England, United Kingdom
Jack Chih-Hsu Lin, Phd education
Doctor of Philosophy (PhD), Quantitative Computational Biology, GPA 4.0
Master of Science (MS), Plant Pathology and Microbiology, GPA 4.0
Bachelor's Degree, Plant Pathology and Microbiology
Frequently asked questions about Jack Chih-Hsu Lin, Phd
What company does Jack Chih-Hsu Lin, Phd work for?
Jack Chih-Hsu Lin, Phd works for Google.
What is Jack Chih-Hsu Lin, Phd's role at Google?
Jack Chih-Hsu Lin, Phd is listed as a Software Engineer at Google.
Where is Jack Chih-Hsu Lin, Phd based?
Jack Chih-Hsu Lin, Phd is based in the San Francisco Bay Area, United States, while working with Google.
What companies has Jack Chih-Hsu Lin, Phd worked for?
Jack Chih-Hsu Lin, Phd has worked for Google, Towards Data Science, C3.ai, Baylor College of Medicine, and Texas Taiwanese Biotechnology Association (TTBA).
Who are Jack Chih-Hsu Lin, Phd's colleagues at Google?
Jack Chih-Hsu Lin, Phd's colleagues at Google include Dan Vo, Stephen Lanier, Empl Month, Gianni Dinovi, and Luiz Scheuer.
How can I contact Jack Chih-Hsu Lin, Phd?
You can use AeroLeads to view verified contact signals for Jack Chih-Hsu Lin, Phd at Google, including work email, phone, and LinkedIn data when available.
What schools did Jack Chih-Hsu Lin, Phd attend?
Jack Chih-Hsu Lin, Phd holds a Doctor of Philosophy (PhD) in Quantitative Computational Biology (GPA 4.0) from Baylor College of Medicine.