Partner Group Engineering Manager (AI Platform)
Current
- Lead the engineering organization that builds the Microsoft Inference Cloud, one of the largest inference platforms, which delivers SaaS inferencing with large language models (LLMs) on multi-billion-dollar GPU capacity.
- Inference Cloud provides LLM APIs for all Microsoft Copilots, Azure OpenAI, and ISV-hosted models such as Llama, Mistral, and Cohere, with a unified and consistent experience for customers and developers.
- Drove the vision, strategy, and execution of Inference Cloud, which delivers inferencing as a SaaS offering, in collaboration with internal and external stakeholders, product managers, and architects.
- Delivered a 36% cost-efficiency gain for LLM inferencing within a short time by optimizing resource utilization, reducing fragmentation, and improving workload scheduling.