Machine Learning Operations Engineer
Current
Optimizing streaming ASR production systems
Performance engineering (GPU)
Open source: https://github.com/k2-fsa/sherpa-onnx
LLM deployment servers and inference engines
Deployment with Kubernetes
CUDA and Triton experiments for scalability