Research Interests
Technical AI safety: reward hacking and pessimistic algorithms.
Publications
For a full list of my publications, please visit my Google Scholar profile.
Blog Posts
April 2025