Hadi Khalaf

Hi, I'm Hadi! I'm a second year Computer Science PhD student at Harvard. My work develops the theoretical and applied foundations of reliable AI systems.

I am very fortunate to be advised by Prof. Flavio Calmon. I am currently supported by the Harvard Prize Fellowship and I am affiliated with the Berkman Klein Center for Internet & Society during Spring 2026.

Before my PhD, I was a research intern at the Harvard Economics Department where I studied algorithmic game theory. I also worked in machine learning for urban planning at the Beirut Urban Lab and in nuclear engineering at MIT as a Research Science Institute fellow. I completed my B.S. in Statistics and B.E. in Computer Engineering at the American University of Beirut.

I am always happy to chat about research or grad school — you can reach me at hadikhalaf [at] g dot harvard dot edu.

Selected News

02/26

I will be interning at Microsoft this summer, working on agentic systems!

02/26

New preprint on robust AI evaluation using social choice theory. Work done with the great Serena Wang, Daniel Halpern, Itai Shapira, Flavio Calmon, and Ariel Procaccia.

09/25

Extremely happy to share that our work on reward hacking in large language models was accepted to NeurIPS 2025 as a Spotlight Paper! I am also thankful for the NeurIPS Scholar Award.

04/25

Our paper on discretion in AI alignment was accepted to ACM FAccT 2025!

03/25

I am at Yale, giving a talk on discretion in AI alignment. Happy to share that this work received the Best Paper Award at the New England NLP workshop! You can check my slides here.

09/24

I joined Harvard as a PhD student in Flavio Calmon's group! Happy to be supported by the Harvard Prize Fellowship.

Preprints

Robust AI Evaluation through Maximal Lotteries

HK, Serena Wang, Daniel Halpern, Itai Shapira, Flavio Calmon, Ariel Procaccia

TL;DR: We introduce robust lotteries to aggregate heterogeneous preferences into reliable model evaluations.

Publications

Inference-Time Reward Hacking in Large Language Models

NeurIPS 2025 (Spotlight Paper)

HK, Claudio Mayrink Verdun, Alex Oesterling, Himabindu Lakkaraju, Flavio Calmon

TL;DR: We propose hedging as a lightweight and theoretically grounded strategy to mitigate reward hacking.

AI Alignment at Your Discretion

ACM FAccT 2025 (Best Paper Award at New England NLP Workshop)

Maarten Buyl*, HK*, Claudio Mayrink Verdun*, Lucas Monteiro Paes*, Caio Vieira Machado, Flavio Calmon

TL;DR: We risk deploying unsafe AI systems if we ignore their discretion in applying alignment objectives.

* indicates equal contribution

Projects

SafetyConflicts Dataset

[link]

TL;DR: We generate realistic user prompts that cause conflicts and tradeoffs between OpenAI's model specs.