news
| Apr 30, 2026 | Our paper When Distance Distracts: Representation Distance Bias in BT-Loss for Reward Models has been accepted at ICML 2026. |
|---|---|
| Jan 06, 2026 | Our new paper Understanding Reward Hacking in Text-to-Image Reinforcement Learning is out, uncovering how reward design leads to artifact exploitation in T2I RL; code is available on GitHub. |
| Sep 18, 2025 | Our paper on boosting fine-grained zero-shot performance of MLLMs with unlabeled data has been accepted at NeurIPS 2025. |