Yaxin Du
PhD Student
Shanghai Jiao Tong University
I am a fourth-year direct-entry Ph.D. student at Shanghai Jiao Tong University, advised by Prof. Siheng Chen. I am currently exploring job opportunities and would be happy to connect. Outside research, I enjoy meeting new people, food, travel, and discovering new things.
Research Overview
My recent work centers on how language-model-based agents can operate reliably in realistic environments, improve through training, and be evaluated on open-ended tasks.
- Tool Use and Environment Interaction. I study how agents interact with tools and external environments, with recent work on MCP-based tool learning, web browsing, and environment-grounded agent benchmarks.
- Training Agents and Coding Models. I work on training strategies for agentic and coding-oriented language models, including role-wise multi-agent training, coder-verifier co-evolution, and mid-training for coding LLMs.
- Evaluation in Open-Ended Tasks. I build benchmarks and data-centric resources for studying how agents and coding models behave on realistic, feedback-driven tasks.
News
- Jun 2026 Released HyperTool: Beyond Step-Wise Tool Calls for Tool-Augmented Agents.
- Jun 2026 Released MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection.
- May 2026 Released DataMaster: Data-Centric Autonomous AI Research.
- May 2026 Three papers were accepted to ICML 2026: G2-Reader, MCP-Persona, and NTK-Selector.
- Apr 2026 Two papers were accepted to ACL 2026: MCP-Flow and VLMGuard-R1.
- Apr 2026 Released the InCoder paper series. [Paper]
- Feb 2026 Released G²-Reader.
- Jan 2026 Joined iQuest Lab, Ubiquant, as a Research Intern, working on foundation model training and coding LLM mid-training.
- Jan 2026 InfoMosaic-Bench was accepted to ICLR 2026.
- Dec 2025 Presented SWE-Dev at the NeurIPS 2025 DL4C Workshop.
- Sep 2025 Joined TikTok AI Innovation Center as a Research Intern, mentored by Qian Liu.
Publications
Selected Publications
- arXiv 2026 HyperTool: Beyond Step-Wise Tool Calls for Tool-Augmented Agents
  arXiv preprint arXiv:2606.13663 [Paper]
- arXiv 2026 DataMaster: Data-Centric Autonomous AI Research
  [Paper]
- ICLR 2026 InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents
  arXiv preprint arXiv:2510.02271 [Paper]
- ICML 2026 G2-Reader: Dual Evolving Graphs for Multimodal Document QA
  [Paper]
- arXiv 2025 SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
  NeurIPS 2025 DL4C Workshop [Paper] [Code]
- ACL 2025 Findings FedDQC: Data Quality Control in Federated Instruction-tuning of Large Language Models
  [Paper]
- ICLR 2024 Workshop Enhancing Data Quality in Federated Fine-tuning of Large Language Models
  [Paper]
- ICML 2026 NTK-Selector: Selecting Auxiliary Data via Neural Tangent Kernels for Low-Resource Domains
  [Paper]
- ICLR 2025 Self-evolving Multi-agent Collaboration Networks for Software Development
- arXiv 2025 MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems
  [Paper]
- KDD 2024 OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
  [Paper]
- NeurIPS 2024 FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
  [Paper]
- ICLR 2024 Fake It Till Make It: Federated Learning with Consensus-Oriented Generation
  [Paper]
Other Publications
- arXiv 2026 MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection
  [Paper]
- arXiv 2026 EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale
  [Paper]
- arXiv 2025 VLMGuard-R1: Proactive Safety Alignment for VLMs via Reasoning-Driven Prompt Optimization
  [Paper]
- arXiv 2025 BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair
  [Paper]
- ACL 2026 MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools
  [Paper]
- arXiv 2025 Federated Instruction Tuning of LLMs with Domain Coverage Augmentation
- IET RSN 2024 Radar-based Human Activity Recognition Using Denoising Techniques to Enhance Classification Accuracy
- arXiv 2026 InCoder-32B: Code Foundation Model for Industrial Scenarios
  [Paper]
- arXiv 2026 InCoder-32B-Thinking: Industrial Code World Model for Thinking
  [Paper]
- RadarConf 2022 A ViT Approach for Short-range Behaviour Recognition Using Radar Signals
- CIE Radar 2021 Radar-based Human Activity Classification with Cyclostationarity
Research Experience
- Jan 2026 - Present Research Intern, iQuest Lab, Ubiquant
  Working on foundation model training, with a recent focus on mid-training for coding LLMs.
- Sep 2025 - Jan 2026 Research Intern, TikTok AI Innovation Center
  Mentored by Qian Liu; worked on LLM agents, coding-related language model research, and related systems problems.
Blog
Training Multi-Agent Systems: From MAS Generators to Co-Evolving Agents
This post summarizes how I think about training multi-agent systems, including MAS generation, component-level training, and co-evolution. I also discuss why training MAS is fundamentally different from prompt engineering alone, and where the main open problems still are.
Read the post