Researching AI Safety and Alignment through the lens of Statistical Physics.
About Me
I am a PhD researcher at the University of Nottingham, supervised by Dr Edward Gillman and Professor Juan Garrahan. My work focuses on AI Safety and Alignment: I apply techniques from Statistical Physics and Large Deviation Theory to estimate the likelihood of rare, unwanted generations from Large Language Models. My paper on this topic provides a rigorous framework for the safety evaluation of language models and was awarded a spotlight position with an oral presentation at ICML 2026.
Research Interests
AI Safety
Large Deviation Theory
Rare Event Analysis
Large Language Models
Publications
Jake McAllister Dorman, Edward Gillman, Dominic C. Rose, Jamie F. Mair, Juan P. Garrahan (2026).
Rare Event Analysis of Large Language Models. ICML 2026 Spotlight with Oral Presentation.
Developed an end-to-end framework for estimating the probability of rare, unwanted completions from LLMs, using techniques from Statistical Physics.
Our methodology achieved probability estimates using 108 times fewer samples than traditional direct sampling.
Education
PhD Physics and Machine Learning, University of Nottingham (2024 - 2028).
MSc Machine Learning in Science, University of Nottingham (2023 - 2024). Distinction (86%).
BSc Mathematics with Economics, University College London (2020 - 2023). 2:1 (68%).
Contact Me
If you have any questions about my research or wish to know more, email me at: jake.dorman@nottingham.ac.uk