About Me

I am a fourth-year direct-entry Ph.D. student in Computer Science at Shanghai Jiao Tong University, advised by Prof. Siheng Chen. I am currently looking for research and engineering opportunities, especially around foundation models and agent systems.

Outside research, I enjoy meeting new people, trying new food, traveling, and seeking out new experiences of all kinds. If you are also working on agents, coding models, or open-ended evaluation, I would be happy to connect.

Research Topics

My recent work is organized around three connected directions.

LLM agents, coding LLM, federated learning

Tool-Using Agents

I study how language-model agents interact with tools and external environments, with recent work on tool learning over the Model Context Protocol (MCP), web browsing, and environment-grounded agent benchmarks.

Training Coding LLMs and Multi-Agent Systems (MAS)

I work on training strategies for agentic and coding-oriented language models, including role-wise multi-agent training, coder-verifier co-evolution, and mid-training for coding LLMs.

Open-Ended Evaluation

I build benchmarks and data-centric resources for studying how agents and coding models behave on realistic, feedback-driven tasks such as software engineering.

News

Publications

A few papers that best represent my recent work on agent systems, coding models, and open-ended evaluation.

Selected Publications

Full Publication List

Recent and Selected

  • arXiv 2026 MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection
    Haowen Wang*, Yaxin Du*, Jian Yang, Jiajun Wu, Shukai Liu, Yuanshuo Zhang, Pingjie Wang, Siheng Chen, Tuney Zheng, Ming Zhou, Xianglong Liu, Bryan Dai
    arXiv preprint [Paper]
  • arXiv 2026 LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
    Jian Yang, Shawn Guo, Wei Zhang, Tianyu Zheng, Yaxin Du, Haau-Sing Li, Jiajun Wu, Yue Song, Yan Xing, Qingsong Cai, Zelong Huang, Chuan Hao, Ran Tao, Xianglong Liu, Wayne Xin Zhao, Mingjie Tang, Weifeng Lv, Ming Zhou, Bryan Dai
    arXiv preprint [Paper]
  • arXiv 2026 HyperTool: Beyond Step-Wise Tool Calls for Tool-Augmented Agents
    Yaxin Du*, Yifan Zhou*, Yujie Ge, Jiajun Wang, Xianghe Pang, Shuo Tang, Tuney Zheng, Bryan Dai, Jian Yang, Siheng Chen
    arXiv preprint [Paper]
  • arXiv 2026 DataMaster: Data-Centric Autonomous AI Research
    Yaxin Du*, Xiyuan Yang*, Zhifan Zhou, Wanxu Liu, Zixing Lei, Zimeng Chen, Fenyi Liu, Haotian Wu, Yuzhu Cai, Zexi Liu, Xinyu Zhu, Wenhao Wang, Linfeng Zhang, Chen Qian, Siheng Chen
    arXiv preprint [Paper]
  • ICML 2026 G2-Reader: Dual Evolving Graphs for Multimodal Document QA
    Yaxin Du*, Junru Song*, Yifan Zhou*, Cheng Wang, Jiahao Gu, Zimeng Chen, Menglan Chen, Wen Yao, Yang Yang, Ying Wen, Siheng Chen [Paper]
  • ICML 2026 NTK-Selector: Selecting Auxiliary Data via Neural Tangent Kernels for Low-Resource Domains
    Pingjie Wang, Hongcheng Liu, Yusheng Liao, Ziqing Fan, Yaxin Du, Shuo Tang, Yanfeng Wang, Yu Wang [Paper]
  • ICLR 2026 InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents
    Yaxin Du, Yuanshuo Zhang, Xiyuan Yang, Yifan Zhou, Cheng Wang, Gongyi Zou, Xianghe Pang, Zhiyu Li, Siheng Chen [Paper]
  • arXiv 2025 SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
    Yaxin Du*, Yuzhu Cai*, Yifan Zhou, Cheng Wang, Yu Qian, Xianghe Pang, Qian Liu, Yue Hu, Siheng Chen
    NeurIPS 2025 DL4C Workshop [Paper] [Code]
  • ICLR 2025 Self-evolving Multi-agent Collaboration Networks for Software Development
    Yue Hu, Yuzhu Cai, Yaxin Du, Xinyu Zhu, Xiangrui Liu, Zijie Yu, Yuchen Hou, Shuo Tang, Siheng Chen
  • arXiv 2025 MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems
    Rui Ye, Shuo Tang, Rui Ge, Yaxin Du, Zhenfei Yin, Siheng Chen, Jing Shao [Paper]
  • ACL 2025 Findings FedDQC: Data Quality Control in Federated Instruction-tuning of Large Language Models
    Yaxin Du, Rui Ye, Fengting Yuchi, Wanru Zhao, Jingjing Qu, Yanfeng Wang, Siheng Chen [Paper]
  • KDD 2024 OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
    Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen [Paper]
  • NeurIPS 2024 FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
    Rui Ye, Rui Ge, Xinyu Zhu, Jingyi Chai, Yaxin Du, Yang Liu, Yanfeng Wang, Siheng Chen [Paper]
  • ICLR 2024 Workshop Enhancing Data Quality in Federated Fine-tuning of Large Language Models
    Wanru Zhao*, Yaxin Du*, Nicholas Donald Lane, Siheng Chen, Yanfeng Wang [Paper]
  • ICLR 2024 Fake It Till Make It: Federated Learning with Consensus-Oriented Generation
    Rui Ye, Yaxin Du, Zhenyang Ni, Siheng Chen, Yanfeng Wang [Paper]

Other Publications

  • arXiv 2026 EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale
    X. Zhu, Y. Cai, Z. Liu, C. Wang, F. Li, W. Jin, W. Liu, Z. Bing, B. Zheng, J. Chai, S. Tang, R. Ye, Y. Du, X. Pang, Y. Du, T. Miao, Y. Zhang, R. Liao, Z. Ding, L. Zhang, Y. Wang, W. E, S. Chen [Paper]
  • arXiv 2026 InCoder-32B-Thinking: Industrial Code World Model for Thinking
    J. Yang, W. Zhang, J. Wu, J. Cheng, T. Zheng, F. Xu, W. Gu, L. Jing, Y. Du, J. Li, et al. [Paper]
  • arXiv 2026 InCoder-32B: Code Foundation Model for Industrial Scenarios
    J. Yang, W. Zhang, J. Wu, J. Cheng, S. Guo, H. Wang, W. Gu, Y. Du, J. Li, F. Xu, et al. [Paper]
  • ACL 2026 MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools
    W. Wang, P. Niu, Z. Xu, Z. Chen, J. Du, Y. Du, X. Pang, K. Huang, Y. Wang, et al. [Paper]
  • arXiv 2025 BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair
    X. Pang, S. Tang, R. Ye, Y. Du, Y. Du, S. Chen [Paper]
  • arXiv 2025 VLMGuard-R1: Proactive Safety Alignment for VLMs via Reasoning-Driven Prompt Optimization
    M. Chen, X. Pang, J. Dong, W. H. Wang, Y. Du, S. Chen [Paper]
  • arXiv 2025 Federated Instruction Tuning of LLMs with Domain Coverage Augmentation
    Z. Wang, Y. Du, Z. Qian, S. Chen
  • IET RSN 2024 Radar-based Human Activity Recognition Using Denoising Techniques to Enhance Classification Accuracy
    R. Yu, Y. Du, J. Li, A. Napolitano, J. Le Kernec
  • RadarConf 2022 A ViT Approach for Short-range Behaviour Recognition Using Radar Signals
    Y. Du, B. Li, J. Li, F. Fioranelli, J. Le Kernec
  • CIE Radar 2021 Radar-based Human Activity Classification with Cyclostationarity
    Y. Du, J. Li, Z. Li, R. Yu, A. Napolitano, F. Fioranelli, J. Le Kernec

Technical Reports

Technical reports and model releases related to industrial coding models and test-time scaling.

Research Experience

  • Jan 2026 - Present Research Intern, iQuest Lab, Ubiquant
    Working on foundation model training, with a recent focus on mid-training for coding LLMs.
  • Sep 2025 - Jan 2026 Research Intern, TikTok AI Innovation Center
    Mentored by Qian Liu; worked on LLM agents, coding language models, and related systems problems.

Blog

May 2026

Training Multi-Agent Systems: From MAS Generators to Co-Evolving Agents

This post summarizes how I think about training multi-agent systems (MAS), including MAS generation, component-level training, and co-evolution. I also discuss why training a MAS differs fundamentally from relying on prompt engineering alone, and where the main open problems remain.
