Xiong-Hui Chen

Researcher at Qwen Team.

Ph.D., LAMDA Group, Nanjing University
Supervisor: Prof. Yang Yu
Email: xiong-hui.chenn [at] outlook.com, chenxh [at] lamda.nju.edu.cn, xionghui.cxh [at] alibaba-inc.com

The Qwen Team is hiring! We are looking for researchers and engineers who are passionate about developing (1) next-generation robotics foundation models and (2) more efficient RL algorithms and infrastructure. If you are interested in exploring this opportunity with us, please email xionghui.cxh [at] alibaba-inc.com.

[ Google Scholar ] [ DBLP ] [ ResearchGate ] [ GitHub ] [ Twitter ] [ Zhihu ] [ LinkedIn ] [ Code Space of LAMDA-RL Group ]

I am currently a Researcher at the Qwen Team. I obtained my Ph.D. degree from Nanjing University under the supervision of Prof. Yang Yu, as a member of the LAMDA Group led by Prof. Zhi-Hua Zhou. I received my B.Sc. degree in Software Engineering from Southeast University in 2018. In September 2018, I began an M.Sc. degree at Nanjing University under the supervision of Prof. Yang Yu, and I continued my research there as a Ph.D. student from September 2020. Since October 2023, I have been a visiting researcher in the UK, working with Prof. Yali Du at King’s College London and Prof. Jun Wang at University College London.

Research Interests:
My research focuses on addressing challenges in applying Reinforcement Learning (RL) to real-world problems. In particular, I am interested in sim-to-real transfer, offline RL, causal inference for RL, and real-world environment reconstruction. I also work on developing RL policies for applications such as autonomous driving, recommender systems, robotics, and industrial control systems. More recently, my research has expanded to large-scale foundation model training.

News

Sep 28, 2025 Our paper was accepted to NeurIPS 2025! [ 20/80 Rules in LLMs ].
Jan 24, 2025 One paper was accepted to ICLR 2025 as an Oral! (AFlow: Automating Agentic Workflow Generation)
Dec 24, 2024 I will host a seminar on “Embodied AI for Robotics” at Swarma Club (集智俱乐部). Everyone is welcome to join [link].
Dec 21, 2024 Invited talk on “Recent Advances in Reinforcement-Learning-Based Generalizable Policy Solving Methods for Large Models” at Amazon Cloud, Beijing.
Sep 29, 2024 Invited talk on “Recent Studies on Causal Reinforcement Learning” at Swarma Club (集智俱乐部). [ slides ].
Sep 28, 2024 Two papers were accepted to NeurIPS 2024! [ Policy Learning from Books (Oral), Knowledgeable Agents from Language Model Rollouts (Poster) ]

Selected publications

* indicates equal contribution
  1. arXiv
    Group Sequence Policy Optimization
    [ Link ]
    Chujie Zheng, Shixuan Liu, Mingze Li, Xiong-Hui Chen, Bowen Yu, Chang Gao, Kai Dang, Yuqiong Liu, Rui Men, An Yang, Jingren Zhou, and Junyang Lin.
    In arXiv. 2025.
  1. NeurIPS
    Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting (Oral)
    [ Link Website Zhihu ]
    Xiong-Hui Chen, Ziyan Wang, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu, and Jun Wang.
    In Advances in Neural Information Processing Systems 37. 2024.
  2. ICML
    Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
    [ Link Code Website ]
    Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Xu-Hui Liu, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Anqi Huang, Kai Xu, Zongzhang Zhang, and Yang Yu.
    In The 41st International Conference on Machine Learning. 2024.
  3. NeurIPS
    Adversarial Counterfactual Environment Model Learning (Spotlight)
    [ Link Code ]
    Xiong-Hui Chen, Yang Yu, Zheng-Mao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Hongqiu Wu, Rong-Jun Qin, Ruijin Ding, and Fangsheng Huang.
    In Advances in Neural Information Processing Systems 36. 2023.
  4. NeurIPS
    Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning
    [ Link Code ]
    Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, and Yang Yu.
    In Advances in Neural Information Processing Systems 34. 2021.
  5. NeurIPS
    Offline Model-based Adaptable Policy Learning
    [ Link Code ]
    Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei (Tony) Qin, Wenjie Shang, and Jieping Ye.
    In Advances in Neural Information Processing Systems 34. 2021.


Correspondence


Laboratory: Computer Science Building, Xianlin Campus of Nanjing University

Address: Xiong-Hui Chen, National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China.
南京市栖霞区仙林大道163号, 南京大学仙林校区603信箱, 软件新技术国家重点实验室, 210023.