Hi, I'm Hadi! I'm a second year Computer Science PhD student at Harvard. I am very fortunate to be advised by Prof. Flavio Calmon.
My research focuses on developing robust and scalable tools for AI post-training. I aim to re-engineer training and evaluation workflows to target complex capabilities directly, without relying on noisy proxies. By grounding these workflows in theoretical rigor, I hope to build AI systems capable of helping us address our most complex and important challenges.
Before starting my PhD, I was a research intern in the Economics Department at Harvard, where I worked on partial identification and algorithmic game theory. I also worked on machine learning for urban planning at the Beirut Urban Lab and on computational nuclear engineering at MIT as a Research Science Institute fellow.
I am always happy to chat about research. You can reach me at hadikhalaf [at] g dot harvard dot edu.

Selected News
09/25
Extremely happy to share that our work on reward hacking🔍 in large language models was accepted to NeurIPS 2025 as a Spotlight Paper! I am also grateful to have received the NeurIPS Scholar Award.
07/25
I am at ICML, presenting our work on reward hacking🔍 at the Models of Human Feedback for AI Alignment Workshop. Grateful to have been awarded the Hudson River Trading travel grant.
04/25
Our paper on discretion🔍 in AI alignment was accepted to ACM FAccT 2025!
03/25
I am at Yale, giving a talk on discretion🔍 in AI alignment. Happy to share that this work received the Best Paper Award at the New England NLP Workshop! You can find my slides here.
09/24
I joined Harvard as a PhD student in Flavio Calmon's group! Happy to be supported by the Harvard Graduate Prize Fellowship.
Publications
Neural Information Processing Systems (NeurIPS) 2025
Inference-Time Reward Hacking in Large Language Models
Spotlight Paper
HK, Claudio Mayrink Verdun, Alex Oesterling, Himabindu Lakkaraju, Flavio Calmon
TLDR We propose hedging as a lightweight and theoretically grounded strategy to mitigate reward hacking in inference-time alignment.
ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2025
AI Alignment at Your Discretion
🏆 Best Paper Award at NENLP Workshop
Maarten Buyl, HK, Claudio Mayrink Verdun, Lucas Monteiro Paes, Caio C. Vieira Machado, Flavio Calmon
TLDR We risk deploying unsafe AI systems if we ignore their discretion in applying alignment objectives.
Projects
SafetyConflicts Dataset
TLDR We generate realistic user prompts that create conflicts and tradeoffs between principles in OpenAI's Model Spec. We also include reasoning traces and responses from three frontier reasoning models.