
On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding

May 19, 2025

Haoyuan Wu, Rui Ming, Jilong Gao, Hangyu Zhao, Xueyi Chen, Yikai Yang, Haisheng Zheng, Zhuolun He, Bei Yu
Last updated on May 19, 2025
Large Language Models
Authors
Haoyuan Wu, Ph.D. Student
