Selected Preprints
"On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding", arXiv:2505.12723 (2025).
"ToTRL: Unlock LLM Tree-of-Thoughts Reasoning Potential through Puzzles Solving", arXiv:2505.12717 (2025).
"Architect of the Bits World: Masked Autoregressive Modeling for Circuit Generation Guided by Truth Table", arXiv:2502.12751 (2025).