Xiong-Hui Chen

Joint Postdoctoral Researcher at Peking University and Tongyi Qianwen Lab (Qwen Team).

Ph.D. in LAMDA Group, Nanjing University
Supervisor: Prof. Yang Yu
Email: xiong-hui.chenn [at] outlook.com, chenxh [at] lamda.nju.edu.cn, xionghui.cxh [at] alibaba-inc.com

The Qwen Team is hiring! We are looking for researchers and engineers who are passionate about developing (1) next-generation robotics foundation models and (2) more efficient RL algorithms and infrastructure. If you are interested in exploring this opportunity with us, please email xionghui.cxh [at] alibaba-inc.com.

[ Google Scholar ] [ DBLP ] [ ResearchGate ] [ GitHub ] [ Twitter ] [ Zhihu ] [ LinkedIn ] [ Code Space of LAMDA-RL Group ]

Currently, I am a joint postdoctoral researcher at Peking University and Tongyi Qianwen Lab (Qwen Team), working on embodied foundation models and reinforcement learning for robotics. I obtained my Ph.D. degree from Nanjing University under the supervision of Prof. Yang Yu, as a member of the LAMDA Group led by Prof. Zhi-Hua Zhou. I received my B.Sc. degree in Software Engineering from Southeast University in 2018. In September 2018, I was admitted to the M.Sc. program at Nanjing University under the supervision of Prof. Yang Yu, and I continued my research there as a Ph.D. student from September 2020. Since October 2023, I have been a visiting researcher in the UK, working with Prof. Yali Du at King’s College London and Prof. Jun Wang at University College London.

Research Interests:
My research focuses on addressing challenges in applying Reinforcement Learning (RL) to real-world problems. In particular, I am interested in sim-to-real transfer, offline RL, causal inference for RL, and real-world environment reconstruction. I also work on developing RL policies for applications such as autonomous driving, recommender systems, robotics, and industrial control systems. More recently, my research has expanded to large-scale foundation model training.

News

Jul 4, 2026 Our Qwen-Robot Suite is now public, introducing a family of embodied foundation models for robotic manipulation, navigation, and world modeling.
Sep 28, 2025 Our paper was accepted to NeurIPS 2025! [ 20/80 Rules in LLMs].
Jan 24, 2025 One paper was accepted to ICLR 2025 as an oral presentation! (AFlow: Automating Agentic Workflow Generation)
Dec 24, 2024 I will host a seminar on “embodied AI for robotics” at 集智俱乐部. You are welcome to join us [link].
Dec 21, 2024 Invited talk on “Recent Advances in Reinforcement-Learning-Based Generalizable Policy Solving Methods for Large Models” at Amazon Cloud, Beijing.
Sep 29, 2024 Invited talk on “Recent Studies on Causal Reinforcement Learning” at 集智俱乐部. [ slides ].

Selected publications

* indicates equal contribution; † indicates corresponding author; § indicates project lead.
  1. ArXiv
    Qwen-RobotManip Technical Report: Alignment Unlocks Scale for Robotic Manipulation Foundation Models
    [ Link Code Website ]
    Haoqi Yuan, Zhixuan Liang, Anzhe Chen, Ye Wang, Haoyang Li, Pei Lin, Yiyang Huang, Zixing Lei, Tong Zhang, Jiazhao Zhang, Jie Zhang, Jingyang Fan, Gengze Zhou, Qihang Peng, Chenxu Lv, Xiaoyue Chen, An Yang, Fei Huang, Junyang Lin, Dayiheng Liu, Jingren Zhou, Chenfei Wu, and Xiong-Hui Chen§.
    In ArXiv. 2026.
  2. ArXiv
    Qwen-RobotNav Technical Report: A Scalable Navigation Model Designed for an Agentic Navigation System
    [ Link Code Website ]
    Jiazhao Zhang, Gengze Zhou, Hale Yin, Yiyang Huang, Zixing Lei, Qihang Peng, Haoqi Yuan, Jie Zhang, Xudong Guo, Xiaoyue Chen, An Yang, Fei Huang, Zhibo Yang, Junyang Lin, Dayiheng Liu, Jingren Zhou, Zhuoyuan Yu, Jingyang Fan, Zhixuan Liang, Pei Lin, Ye Wang, Haoyang Li, Anzhe Chen, Kun Yan, Xiao Xu, Jiahao Li, Lulu Hu, Minying Zhang, Shurui Li, Wenhu Xiao, Shuai Bai, Xuancheng Ren, Chenxu Lv, Chenfei Wu, and Xiong-Hui Chen§.
    In ArXiv. 2026.
  3. ArXiv
    Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation
    [ Link Website ]
    Jie Zhang, Xiaoyue Chen, Anzhe Chen, Dayiheng Liu, Deqing Li, Gengze Zhou, Hale Yin, Haoqi Yuan, Haoyang Li, Jiahao Li, Jiazhao Zhang, Jingren Zhou, Kaiyuan Gao, Kun Yan, Lihan Jiang, Ningyuan Tang, Pei Lin, Qihang Peng, Shengming Yin, Tianhe Wu, Tianyi Yan, Xiao Xu, Yan Shu, Yanran Zhang, Ye Wang, Yi Wang, Yilei Chen, Yixian Xu, Yiyang Huang, Yuxiang Chen, Zekai Zhang, Zhendong Wang, Zixing Lei, Zhixuan Liang, Zihao Liu, Zikai Zhou, Chenxu Lv, Xiong-Hui Chen, and Chenfei Wu.
    In ArXiv. 2026.
  4. ArXiv
    Group Sequence Policy Optimization
    [ Link ]
    Chujie Zheng, Shixuan Liu, Mingze Li, Xiong-Hui Chen, Bowen Yu, Chang Gao, Kai Dang, Yuqiong Liu, Rui Men, An Yang, Jingren Zhou, and Junyang Lin.
    In ArXiv. 2025.
  1. NeurIPS
    Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting (Oral)
    [ Link Website Zhihu ]
    Xiong-Hui Chen, Ziyan Wang, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu, and Jun Wang.
    In Advances in Neural Information Processing Systems 37. 2024.
  2. ICML
    Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
    [ Link Code Website ]
    Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Xu-Hui Liu, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Anqi Huang, Kai Xu, Zongzhang Zhang, and Yang Yu.
    In The 41st International Conference on Machine Learning. 2024.
  3. NeurIPS
    Adversarial Counterfactual Environment Model Learning (Spotlight)
    [ Link Code ]
    Xiong-Hui Chen, Yang Yu, Zheng-Mao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Hongqiu Wu, Rong-Jun Qin, Ruijin Ding, and Fangsheng Huang.
    In Advances in Neural Information Processing Systems 36. 2023.
  4. NeurIPS
    Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning
    [ Link Code ]
    Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, and Yang Yu.
    In Advances in Neural Information Processing Systems 34. 2021.
  5. NeurIPS
    Offline Model-based Adaptable Policy Learning
    [ Link Code ]
    Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei (Tony) Qin, Wenjie Shang, and Jieping Ye.
    In Advances in Neural Information Processing Systems 34. 2021.


Correspondence


Laboratory: Computer Science Building, Xianlin Campus of Nanjing University

Address: Xiong-Hui Chen, National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China.
南京市栖霞区仙林大道163号, 南京大学仙林校区603信箱, 软件新技术国家重点实验室, 210023.