Kai Zhang Email and Phone Number
Specialties: Cloud Computing, Kubernetes, Docker, Cloud Native AI, MLOps, Distributed system management, job orchestration and scheduling, Deep learning, TensorFlow, Messaging System, QoS (High Availability, Scalability and Reliability) of Distributed Services, Service-Oriented Architecture, Search Engine, Semantic Web
Latest interests: Cloud native, Kubernetes, Deep Learning, Cloud native AI, MLOps
Alibaba Cloud
- Website: aliyun.com
- Employees: 2157
Senior Staff Engineer, Alibaba Cloud
Jan 2017 - Present, Beijing City, China
Working on Alibaba Cloud Container Service for Kubernetes (a.k.a. ACK), leading heterogeneous computing and AI solutions based on cloud-native technology. We deliver a large-scale AI platform through Docker and Kubernetes for worldwide customers.
- Lead development of the ACK Kubernetes core scheduler product. It supports batch scheduling policies (including gang, capacity, and fair share), job queues, elastic scheduling, colocation scheduling of hybrid workloads, resource/task topology-aware scheduling, and fine-grained CPU/GPU resource control, sharing, and isolation. ACK scheduling performance is 5-10 times that of the open-source scheduler.
- Established the ACK Cloud-native AI Suite products, with triple-digit business growth for three consecutive years. The suite offers unified management, monitoring, and scheduling for GPU/NPU/RDMA; support for popular open-source AI frameworks (TF, PyTorch, DeepSpeed, Megatron, Horovod, Spark, MPI, Triton, KServe, etc.) and models (CV/NLP/LLM/AIGC); AI task orchestration and scheduling; elastic training and fault tolerance; serverless model inference; dataset management and acceleration; MLOps lifecycle management; integration with Hugging Face, ModelScope, AI container image repositories, and other ecosystems; and model development and platform operation tools and SDKs, providing full-stack support for AI system engineering efficiency. We manage over ten thousand GPU cards for clients, increase GPU utilization by 100%, accelerate AI tasks by 30%, and improve AI engineering efficiency by 50%.
- Lead the team in creating and contributing to multiple cloud-native open-source projects, including Kube-scheduler (K8s scheduler), Koordinator (colocation scheduler), Kube-queue/Kueue (K8s job queue), Fluid (dataset orchestration and acceleration, CNCF sandbox), Kubeflow (machine learning), and KServe (AI model inference). Encouraged the cloud-native community to treat AI and big data workloads as first-class citizens and to provide more native support for them.
Technical Director, Megvii (Face++) - 旷视科技
Feb 2016 - Nov 2016, Beijing City, China
As the leading AI technology startup in China, Megvii has established strong competency in deep learning and computer vision. It provides highly popular intelligent services such as Face++ and FaceID. In addition, we built a comprehensive cloud platform to produce AI models and services more efficiently. It is a PaaS for the deep learning domain, designed to help the different roles involved in AI, such as researchers, consultants, and application developers.
- As technical director and lead architect, I work with the CTO and other senior leaders to define the product and roadmap.
- Lead the core team (~15 top developers) to design, develop, and operate the whole platform, taking on the challenge of integrating cloud with AI.
  - The platform gives end-to-end support for deep learning: data collection, labeling, dataset preprocessing, data flow, neural network construction, model training and management, packaging models into executables for various platforms, and even publishing them as APIs.
  - Monitoring, logging, and training experiment management services are provided to smooth deep learning iterations.
  - Solid cluster management (such as Mesos and Kubernetes), job scheduling, and orchestration ensure high efficiency, HA, and scalability throughout the training lifecycle. The cluster manages 100+ servers providing heterogeneous resources such as CPU, GPU, Ethernet, InfiniBand, and distributed storage.
  - All jobs are isolated by Docker, with volumes, overlay networks, and container lifecycles controlled automatically.
  - On top of this, it provides major services to accelerate deep learning, such as model training monitoring (similar to Google’s TensorBoard), model checkpoint/restore, service registry/discovery, an admin console, and a CLI.
  - The most popular deep learning frameworks, such as Google’s TensorFlow, are supported in a framework-neutral way, alongside our proprietary training engine.
- As technical lead, I am also responsible for helping the young, elite engineering team grow quickly in both professional engineering skills and technical insight.
Senior Software Architect, Netdragon Websoft Inc. (网龙网络公司)
May 2015 - Feb 2016, Beijing City, China
- As leader of the technical team (~60 staff), responsible for technical strategy and decisions, product development, and operation of NetDragon’s PaaS cloud platform for the online education business. The platform now supports dozens of applications (including K12 education, IM, social, ERP, etc.) serving both internal users and external customers.
- Lead architecture design and technology verification of core cloud services, such as content store, database, user management, security, application monitoring, scaling, profiling, and logging. Focus on HA, scalability, reliability, and QoS, as well as DevOps adoption in product teams.
- Lead deployment of the platform on third-party IaaS clouds such as AWS, and address the technical challenges of hybrid cloud deployment.
- Responsible for product design and development of several key K12 applications.
Senior Software Engineer, IBM
Dec 2013 - May 2015, Beijing City, China
Working on the Platform as a Service (PaaS) layer of cloud computing, focusing on cloud application QoS and resiliency, including application monitoring, logging, auto scaling, dynamic configuration, and HA. Hands-on experience with Pivotal's Cloud Foundry and IBM's Bluemix (PaaS) cloud platform. Leading Bluemix Fabric operation and development work in the China lab.
Advisory Software Engineer, IBM
Jul 2011 - Nov 2013, Beijing, China
I'm working on the public cloud project, focusing on PaaS (platform as a service), SaaS (software as a service), and BPaaS (business process as a service). Meanwhile, I'm also studying mobile applications and IoT (internet of things).
Staff Software Engineer, IBM
Jul 2009 - Jun 2011
I'm a staff software engineer on WebSphere at IBM China in Beijing (CN). I have over 3 years of experience in every aspect of application integration software, from development, test, and assurance to planning, with extensive skills in business integration, enterprise connectivity solutions, SOA, and related technologies. I am also interested in Web 2.0, the Semantic Web, and Information Retrieval.
Software Engineer, Pivotal Labs
Apr 2014 - May 2014, San Francisco Bay Area
I am participating in the Pivotal Dojo program onsite at the San Francisco office, working on the development of Cloud Foundry, a popular Platform-as-a-Service cloud offering. My responsibilities focus on Cloud Foundry runtime development, pull request verification and merging, new feature design, etc.
Staff Software Engineer, IBM China Development Lab
2006 - 2010
Kai Zhang Skills
Kai Zhang Education Details
- Computer Science And Technology
- Lan Hua San Zhong
Frequently Asked Questions about Kai Zhang
What company does Kai Zhang work for?
Kai Zhang works for Alibaba Cloud.
What is Kai Zhang's role at the current company?
Kai Zhang's current role is Senior Staff Engineer at Alibaba Cloud.
What is Kai Zhang's email address?
Kai Zhang's email address is ws****@****ail.com
What is Kai Zhang's direct phone number?
Kai Zhang's direct phone number is (800) 426*****
What schools did Kai Zhang attend?
Kai Zhang attended Beijing Institute Of Technology and Lan Hua San Zhong.
What skills is Kai Zhang known for?
Kai Zhang has skills like Cloud Computing, SOA, SaaS, Mobile Applications, Java Enterprise Edition, XML, Enterprise Software, Design Patterns, Integration, High Availability, Unix, Distributed Systems.
Who are Kai Zhang's colleagues?
Kai Zhang's colleagues are Wayne Shi, Kell Xiong, Lei Shi, 王志国, Leaf Ye, 高凌霄, 贾少天.