
Diversity or Precision? A Deep Dive into Next Token Prediction

Dec 30, 2025
Haoyuan Wu, Hai Wang, Jiajia Wu, Jinxiang Ou, Keyao Wang, Weile Chen, Zihao Zheng, Bei Yu
Type: Conference paper
Publication: arXiv:2512.22955 (2025) (Hunyuan Technical Report)
Last updated on Dec 30, 2025
Large Language Models
Authors
Haoyuan Wu, Ph.D. Student
