Applied Scientist Intern
Current
- Developed a benchmark and evaluation pipeline to measure robustness to question rephrasing in a customer-facing LLM → identified critical inconsistencies
- Investigated the empirical relation between task complexity, network capacity, and attainable network robustness → trained 10,000+ networks