Qianzhong Chen’s Website

I’m a first-year PhD student in the Aeronautics and Astronautics Department at Stanford University, advised by Dr. Mac Schwager. Before that, I was a research assistant at UIUC-ACRL, advised by Dr. Naira Hovakimyan and Dr. Sheng Cheng.

I have also spent time at xdof.ai, Unitree, and Centrillion as a robotics engineering/research intern. I was fortunate to be advised by Philipp Wu (xdof.ai) and Fred Shentu (xdof.ai).

My interests and background span robot VLA models, world models, progress-based reward models, and visual navigation for autonomous drones.

I received my Bachelor’s degree in Mechanical Engineering from both Zhejiang University and UIUC in 2023. I received my Master’s degree in Mechanical Engineering from Stanford University in 2025.

My Resume can be found here (updated Dec. 2025).

Feel free to contact me via email (qchen23 {at} stanford.edu), LinkedIn, or WeChat: CQZ_David.

Selected Publications (Full list)

[ICLR 2026] SARM: Stage-Aware Reward Modeling for Long Horizon Robot Manipulation
Q. Chen, J. Yu, M. Schwager, P. Abbeel, F. Shentu, P. Wu
arXiv | website | LeRobot | code

TL;DR: SARM is a stage-aware, video-based reward modeling framework that enables scalable and robust imitation learning for long-horizon tasks by deriving progress signals from natural language annotations, dramatically improving policy performance over standard behavior cloning.
[CoRL 2025] ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation
S. Huang, Q. Chen, X. Zhang, J. Sun, M. Schwager
arXiv | website | code

TL;DR: A state-of-the-art 3D world model trained directly from point clouds, which enables accurate dynamics prediction across multi-object, multi-material scenarios and empowers model-based visuomotor control in robotic manipulation tasks.
[RA-L 2025] GRaD-Nav++: Vision-Language Model Enabled Visual Drone Navigation with Gaussian Radiance Fields and Differentiable Dynamics
Q. Chen, N. Gao, S. Huang, J. Low, T. Chen, J. Sun, M. Schwager
arXiv | website | code

TL;DR: GRaD-Nav++ is a lightweight, fully onboard Vision-Language-Action framework that enables drones to follow natural language commands in real time using DiffRL training in a 3DGS simulator, achieving strong generalization across tasks and environments both in simulation and on real hardware.

Recent news

  • 2026/01: 🎉🎉 Our new paper SARM, on reward modeling for robot manipulation, has been accepted to ICLR 2026!
  • 2026/01: 🚀🚀 SARM is now natively supported in LeRobot. Thanks, Hugging Face 🤗!
  • 2025/11: 🎉🎉 Our new paper GRaD-Nav++, on drone VLA, has been accepted to RA-L 2025!
  • 2025/08: 🎉🎉 Our new paper ARCH, on RL for manipulation, has been accepted to CoRL 2025!
  • 2025/08: 🎉🎉 Our new paper ParticleFormer, a 3D world model for manipulation, has been accepted to CoRL 2025!
  • 2025/04: ✨✨ I was admitted to the Aeronautics and Astronautics Department at Stanford University as a PhD student, supervised by Dr. Mac Schwager.

Honors and awards

  • Stanford Aero-Astro PhD Fellowship (2025)
  • Outstanding Undergraduate Thesis Award, Department of Mechanical Engineering, Zhejiang University (2023)
  • First Class Academic Scholarship of ZJU-UIUC Institute (Top 1%) (2022)

Service

  • Journal Reviewer: IEEE RA-L (2025), IEEE IoT Journal (2025)
  • Conference Reviewer: IROS (2025), ICRA (2026)
  • Member of the IEEE Robotics and Automation Society