Reinforcement Learning on Pre-Training Data

Sep 24, 2025 · Siheng Li, Kejiao Li, Zenan Xu, Guanhua Huang, Evander Yang, Kun Li, Haoyuan Wu, Jiajia Wu, Zihao Zheng, Chenchen Zhang, Kun Shi, Kyrierl Deng, Qi Yi, Ruibin Xiong, Tingqiang Xu, Yuhao Jiang, Jianfeng Yan, Yuyuan Zeng, Guanghui Xu, Jinbao Xue, Zhijiang Xu, Zheng Fang, Shuai Li, Qibin Liu, Xiaoxue Li, Zhuoyu Li, Yangyu Tao, Fei Gao, Cheng Jiang, Bo Chao Wang, Kai Liu, Jianchen Zhu, Wai Lam, Bo Zhou, Di Wang
Type
Publication
arXiv:2509.19249 (2025), (Technical Report)