* indicates equal contribution.
Preprints
-
A Survey on Model-based Reinforcement Learning
[
Link
]
Fan-Ming Luo,
Tian Xu,
Hang Lai,
Xiong-Hui Chen
,
Weinan Zhang,
and Yang Yu.
In ArXiv.
2022.
-
Offline Reinforcement Learning with Causal Structured World Models
[
Link
]
Zheng-Mao Zhu,
Xiong-Hui Chen
,
Hong-Long Tian,
Kun Zhang,
and Yang Yu.
In ArXiv.
2022.
Manuscripts
2023
-
Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions
[
Link
Code
Appendix
]
Xiong-Hui Chen
,
Fan-Ming Luo,
Yang Yu,
Qingyang Li,
Zhiwei Qin,
Wenjie Shang,
and Jieping Ye.
In IEEE Transactions on Pattern Analysis and Machine Intelligence.
2023.
Papers
2024
-
KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
[
Link
Website
Zhihu
]
Jing-Cheng Pang,
Si-Hang Yang,
Kaiyuan Li,
Jiaji Zhang,
Xiong-Hui Chen
,
Nan Tang,
and Yang Yu.
In Advances in Neural Information Processing Systems 37.
2024.
-
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
(Oral)
[
Link
Website
Zhihu
]
Xiong-Hui Chen
,
Ziyan Wang,
Yali Du,
Shengyi Jiang,
Meng Fang,
Yang Yu,
and Jun Wang.
In Advances in Neural Information Processing Systems 37.
2024.
-
Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
[
Link
Code
Website
]
Xiong-Hui Chen
,
Junyin Ye,
Hang Zhao,
Yi-Chen Li,
Xu-Hui Liu,
Haoran Shi,
Yu-Yan Xu,
Zhihao Ye,
Si-Hang Yang,
Anqi Huang,
Kai Xu,
Zongzhang Zhang,
and Yang Yu.
In The 41st International Conference on Machine Learning.
2024.
-
Ruifeng Chen,
Xiong-Hui Chen*
,
Yi-Hao Sun,
Siyuan Xiao,
Minhui Li,
and Yang Yu.
In The 41st International Conference on Machine Learning.
2024.
-
Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning
[
Link
]
Chengxing Jia,
Chenxiao Gao,
Hao Yin,
Fuxiang Zhang,
Xiong-Hui Chen
,
Tian Xu,
Lei Yuan,
Zongzhang Zhang,
Yang Yu,
and Zhi-Hua Zhou.
In The 12th International Conference on Learning Representations.
2024.
-
Language Model Self-improvement by Reinforcement Learning Contemplation
[
Link
]
Jing-Cheng Pang,
Pengyuan Wang,
Kaiyuan Li,
Xiong-Hui Chen
,
Jiacheng Xu,
Zongzhang Zhang,
and Yang Yu.
In The 12th International Conference on Learning Representations.
2024.
2023
-
Adversarial Counterfactual Environment Model Learning
(Spotlight)
[
Link
Code
]
Xiong-Hui Chen
,
Yang Yu,
Zheng-Mao Zhu,
Zhihua Yu,
Zhenjun Chen,
Chenghe Wang,
Yinan Wu,
Hongqiu Wu,
Rong-Jun Qin,
Ruijin Ding,
and Fangsheng Huang.
In Advances in Neural Information Processing Systems 36.
2023.
-
Natural Language Instruction-following with Task-related Language Development and Translation
[
Link
Code
]
Jing-Cheng Pang,
Xinyu Yang,
Si-Hang Yang,
Xiong-Hui Chen
,
and Yang Yu.
In Advances in Neural Information Processing Systems 36.
2023.
-
Object-Oriented Option Framework for Robotics Manipulation in Clutter
[
Link
Code
]
Jing-Cheng Pang,
Young Stalin,
Xiong-Hui Chen
,
Xinyu Yang,
Yang Yu,
Mas Ma,
Ziqi Guo,
Howard Yang,
and Bill Huang.
In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems.
2023.
-
Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-world Long-term User Engagement in Sequential Recommender Systems
[
Link
Code
]
Xiong-Hui Chen
,
Bowei He,
Yang Yu,
Qingyang Li,
Zhiwei (Tony) Qin,
Wenjie Shang,
Jieping Ye,
and Chen Ma.
In Proceedings of the 39th IEEE International Conference on Data Engineering.
2023.
2022
-
NeoRL: A Near Real-world Benchmark for Offline Reinforcement Learning
[
Link
Code
Zhihu
]
Rong-Jun Qin,
Xingyuan Zhang,
Songyi Gao,
Xiong-Hui Chen
,
Zewen Li,
Weinan Zhang,
and Yang Yu.
In Advances in Neural Information Processing Systems 35 Datasets and Benchmarks Track.
2022.
-
A Simulator-based Decision-Making Approach to Sequential Recommender Systems with Application in Ride-hailing Platform
[
Link
Code
]
Xiong-Hui Chen
,
Yang Yu,
Qingyang Li,
Bowei He,
Zhiwei (Tony) Qin,
Wenjie Shang,
and Jieping Ye.
In the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Workshop on Decision Intelligence and Analytics for Online Marketplaces.
2022.
2021
-
Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning
[
Link
Code
]
Xiong-Hui Chen
,
Shengyi Jiang,
Feng Xu,
Zongzhang Zhang,
and Yang Yu.
In Advances in Neural Information Processing Systems 34.
2021.
-
Offline Model-based Adaptable Policy Learning
[
Link
Code
]
Xiong-Hui Chen
,
Yang Yu,
Qingyang Li,
Fan-Ming Luo,
Zhiwei (Tony) Qin,
Wenjie Shang,
and Jieping Ye.
In Advances in Neural Information Processing Systems 34.
2021.
2020
-
Efficient Exploration by Novelty-Pursuit
[
Link
Code
]
Ziniu Li,
and
Xiong-Hui Chen*
.
In Proceedings of the 2nd International Conference on Distributed Artificial Intelligence.
2020.
2019
-
Reinforcement Learning with Derivative-Free Exploration
[
Link
Code
]
Xiong-Hui Chen
,
and Yang Yu.
In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems.
2019.