Yaxin Du

PhD Student

Shanghai Jiao Tong University

I am a Ph.D. student at Shanghai Jiao Tong University, advised by Prof. Siheng Chen. My research interests include LLM agents, coding LLMs, and federated learning. Recently, I have been working on agentic systems, coding-oriented language models, and related benchmarks and resources for large model research.

I will attend ICLR 2026 and would be happy to connect about research opportunities and collaborations.

News

  • Apr 2026 Attending ICLR 2026 and presenting InfoMosaic-Bench; happy to connect in person.
  • Apr 2026 Two papers accepted to ACL 2026: MCP-Flow and VLMGuard-R1.
  • Apr 2026 Released the InCoder paper series. [Paper]
  • Feb 2026 Released G²-Reader.
  • Jan 2026 Joined iQuest Lab, Ubiquant, as a Research Intern, working on foundation model training and coding LLM mid-training.
  • Jan 2026 InfoMosaic-Bench was accepted to ICLR 2026.
  • Dec 2025 Presented SWE-Dev at the NeurIPS 2025 DL4C Workshop.
  • Sep 2025 Joined TikTok AI Innovation Center as a Research Intern, mentored by Qian Liu.

Publications

Selected Publications

  • ICLR 2026 InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents
    Yaxin Du, Yuanshuo Zhang, Xiyuan Yang, Yifan Zhou, Cheng Wang, Gongyi Zou, Xianghe Pang, Wenhao Wang, et al.
    arXiv preprint arXiv:2510.02271 [Paper]
  • arXiv 2026 G²-Reader: Dual Evolving Graphs for Multimodal Document QA
    Yaxin Du*, Junru Song*, Yifan Zhou*, Cheng Wang, Jiahao Gu, Zimeng Chen, Menglan Chen, Wen Yao, Yang Yang, Ying Wen, Siheng Chen [Paper]
  • arXiv 2025 SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
    Yaxin Du*, Yuzhu Cai*, Yifan Zhou, Cheng Wang, Yu Qian, Xianghe Pang, Qian Liu, Yue Hu, Siheng Chen
    NeurIPS 2025 DL4C Workshop [Paper] [Code]
  • ACL 2025 Findings FedDQC: Data Quality Control in Federated Instruction-tuning of Large Language Models
    Yaxin Du, Rui Ye, Fengting Yuchi, Wanru Zhao, Jingjing Qu, Yanfeng Wang, Siheng Chen [Paper]
  • ICLR 2024 Workshop Enhancing Data Quality in Federated Fine-tuning of Large Language Models
    Wanru Zhao*, Yaxin Du*, Nicholas Donald Lane, Siheng Chen, Yanfeng Wang [Paper]
  • ICLR 2025 Self-evolving Multi-agent Collaboration Networks for Software Development
    Yue Hu, Yuzhu Cai, Yaxin Du, Xinyu Zhu, Xiangrui Liu, Zijie Yu, Yuchen Hou, Shuo Tang, Siheng Chen
  • arXiv 2025 MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems
    Rui Ye, Shuo Tang, Rui Ge, Yaxin Du, Zhenfei Yin, Siheng Chen, Jing Shao [Paper]
  • KDD 2024 OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
    Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen [Paper]
  • NeurIPS 2024 FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
    Rui Ye, Rui Ge, Xinyu Zhu, Jingyi Chai, Yaxin Du, Yang Liu, Yanfeng Wang, Siheng Chen [Paper]
  • ICLR 2024 Fake It Till Make It: Federated Learning with Consensus-Oriented Generation
    Rui Ye, Yaxin Du, Zhenyang Ni, Siheng Chen, Yanfeng Wang [Paper]

Other Publications

  • ACL 2026 VLMGuard-R1: Proactive Safety Alignment for VLMs via Reasoning-Driven Prompt Optimization
    M. Chen, X. Pang, J. Dong, W. H. Wang, Yaxin Du, S. Chen [Paper]
  • arXiv 2025 BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair
    X. Pang, S. Tang, R. Ye, Yaxin Du, S. Chen [Paper]
  • ACL 2026 MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools
    W. Wang, P. Niu, Z. Xu, Z. Chen, J. Du, Yaxin Du, X. Pang, K. Huang, Y. Wang, et al. [Paper]
  • arXiv 2025 Federated Instruction Tuning of LLMs with Domain Coverage Augmentation
    Z. Wang, Yaxin Du, Z. Qian, S. Chen
  • arXiv 2025 Selecting Auxiliary Data via Neural Tangent Kernels for Low-Resource Domains
    P. Wang, H. Liu, Y. Liao, Z. Fan, Yaxin Du, S. Tang, Y. Wang, Y. Wang [Paper]
  • IET RSN 2024 Radar-based Human Activity Recognition Using Denoising Techniques to Enhance Classification Accuracy
    R. Yu, Yaxin Du, J. Li, A. Napolitano, J. Le Kernec
  • arXiv 2026 InCoder-32B: Code Foundation Model for Industrial Scenarios
    J. Yang, W. Zhang, J. Wu, J. Cheng, S. Guo, H. Wang, W. Gu, Yaxin Du, J. Li, F. Xu, et al. [Paper]
  • arXiv 2026 InCoder-32B-Thinking: Industrial Code World Model for Thinking
    J. Yang, W. Zhang, J. Wu, J. Cheng, T. Zheng, F. Xu, W. Gu, L. Jing, Yaxin Du, J. Li, et al. [Paper]
  • RadarConf 2022 A ViT Approach for Short-range Behaviour Recognition Using Radar Signals
    Yaxin Du, Bingliang Li, Jipeng Li, Francesco Fioranelli, Julien Le Kernec
  • CIE Radar 2021 Radar-based Human Activity Classification with Cyclostationarity
    Yaxin Du, Jipeng Li, Zhouyixian Li, Ran Yu, Antonio Napolitano, Francesco Fioranelli, Julien Le Kernec

Research Experience

  • Jan 2026 - Present Research Intern, iQuest Lab, Ubiquant
    Working on foundation model training, with a recent focus on mid-training for coding LLMs.
  • Sep 2025 - Jan 2026 Research Intern, TikTok AI Innovation Center
    Mentored by Qian Liu; worked on LLM agents, coding language models, and the systems problems that arise around them.