Algorithm Engineer
Current
+ Research and development of algorithms for model training, interpretation, compression, and acceleration.
+ Analyze model accuracy and performance after porting to a new platform.
+ LLM inference acceleration methods such as static KV cache, quantization, tensor parallelism, and speculative decoding.
+ [Keras cv attention.