1. Distributed parallel machine learning algorithms implemented include: distributed deep learning (DNN, with data parallelism and model parallelism), maximum entropy (Max Entropy), logistic regression (LR), gradient-boosted decision trees (LambdaMART / GBDT), random forest (RF), PageRank, collaborative filtering, SimHash, MinHash, Disjoint Set, statistical word alignment (SMT), topic models (PLSA / semi-PLSA / online-PLSA), LDA, partitional clustering (K-means / fast K-means / streaming K-means), hierarchical clustering (HAC), Query Clustering, EMBT, matrix factorization (ALS), Learning with Local and Global Consistency (LLGC), distributed TF-IDF, distributed AUC evaluation, and parallelized word2vec.
2. Large-scale data mining and machine learning algorithms for Internet advertising, Internet search, NLP, image understanding, speech recognition, etc.
3. Distributed parallel computing architectures and frameworks.
4. Deep learning for speech recognition.
5. Semi-supervised learning, MRF, CRF, belief propagation, manifold learning, data modeling and analysis, etc., for Internet advertising and Internet search.
6. Stereo matching algorithms and methods based on texture, context, and semantic structure, etc., for image understanding.