Ethan Perez
Alignment researcher at Anthropic. Author or co-author of several foundational papers on LLM behavioral measurement: first author of Perez et al. (2022), which reports the first at-scale measurement of sycophancy and inverse-scaling behaviors, and co-author of the Kadavath et al. self-knowledge study and the Sharma et al. sycophancy paper.
Wiki Sources
- Perez et al. — Discovering LM Behaviors with Model-Written Evaluations (2022) — first author
- Sharma et al. — Sycophancy (2023) — co-author
- Kadavath et al. — Know What They Know (2022) — co-author