On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding
May 19, 2025

Haoyuan Wu
Rui Ming
Jilong Gao
Hangyu Zhao
Xueyi Chen
Yikai Yang
Haisheng Zheng
Zhuolun He
Bei Yu
Type
Publication
The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)