news

Jun 15, 2026 I’m very happy to join Google as a student researcher for summer 2026.
Jun 09, 2026 See our new paper A Unifying Lens on SFT Through Target Distribution Design.
Jun 06, 2026 We presented our paper Understanding Reward Hacking in Text-to-Image Reinforcement Learning at CVPR 2026. The paper uncovers how different rewards lead to exploitation in T2I RL and how to mitigate it; code is available on GitHub.
Apr 30, 2026 Our paper When Distance Distracts: Representation Distance Bias in BT-Loss for Reward Models has been accepted at ICML 2026.
Sep 18, 2025 Our paper on boosting fine-grained zero-shot performance of MLLMs with unlabeled data has been accepted at NeurIPS 2025.