Publications

* indicates equal contribution.

Preprints

      1. ArXiv
        A Survey on Model-based Reinforcement Learning
        [ Link ]
        Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, and Yang Yu.
        In ArXiv. 2022.
      2. ArXiv
        Offline Reinforcement Learning with Causal Structured World Models
        [ Link ]
        Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, and Yang Yu.
        In ArXiv. 2022.

            Manuscripts

            2023
            1. TPAMI
              Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions
              [ Link Code Appendix ]
              Xiong-Hui Chen, Fan-Ming Luo, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, and Jieping Ye.
              In IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023.

            Papers

            2024
            1. NeurIPS
              KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
              [ Link Website Zhihu ]
              Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, and Yang Yu.
              In Advances in Neural Information Processing Systems 37. 2024.
            2. NeurIPS
              Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting (Oral)
              [ Link Website Zhihu ]
              Xiong-Hui Chen, Ziyan Wang, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu, and Jun Wang.
              In Advances in Neural Information Processing Systems 37. 2024.
            3. ICML
              Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration
              [ Link Code Website ]
              Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Xu-Hui Liu, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Anqi Huang, Kai Xu, Zongzhang Zhang, and Yang Yu.
              In The 41st International Conference on Machine Learning. 2024.
            4. ICML
              Policy-conditioned Models are More Generalizable
              [ Link Code Website ]
              Ruifeng Chen, Xiong-Hui Chen*, Yi-Hao Sun, Siyuan Xiao, Minhui Li, and Yang Yu.
              In The 41st International Conference on Machine Learning. 2024.
            5. ICLR
              Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning
              [ Link ]
              Chengxing Jia, Chenxiao Gao, Hao Yin, Fuxiang Zhang, Xiong-Hui Chen, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu, and Zhi-Hua Zhou.
              In The 12th International Conference on Learning Representations. 2024.
            6. ICLR
              Language Model Self-improvement by Reinforcement Learning Contemplation
              [ Link ]
              Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, and Yang Yu.
              In The 12th International Conference on Learning Representations. 2024.
            2023
            1. NeurIPS
              Adversarial Counterfactual Environment Model Learning (Spotlight)
              [ Link Code ]
              Xiong-Hui Chen, Yang Yu, Zheng-Mao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Hongqiu Wu, Rong-Jun Qin, Ruijin Ding, and Fangsheng Huang.
              In Advances in Neural Information Processing Systems 36. 2023.
            2. NeurIPS
              Natural Language Instruction-following with Task-related Language Development and Translation
              [ Link Code ]
              Jing-Cheng Pang, Xinyu Yang, Si-Hang Yang, Xiong-Hui Chen, and Yang Yu.
              In Advances in Neural Information Processing Systems 36. 2023.
            3. IROS
              Object-Oriented Option Framework for Robotics Manipulation in Clutter
              [ Link Code ]
              Pang Jing-Cheng, Young Stalin, Xiong-Hui Chen, Xinyu Yang, Yang Yu, Mas Ma, Ziqi Guo, Howard Yang, and Bill Huang.
              In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems. 2023.
            4. ICDE
              Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-world Long-term User Engagement in Sequential Recommender Systems
              [ Link Code ]
              Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei (Tony) Qin, Wenjie Shang, Jieping Ye, and Chen Ma.
              In Proceedings of the 39th IEEE International Conference on Data Engineering. 2023.
            2022
            1. NeurIPS
              NeoRL: A Near Real-world Benchmark for Offline Reinforcement Learning
              [ Link Code Zhihu ]
              Rong-Jun Qin, Xingyuan Zhang, Songyi Gao, Xiong-Hui Chen, Zewen Li, Weinan Zhang, and Yang Yu.
              In Advances in Neural Information Processing Systems 35 Datasets and Benchmarks Track. 2022.
            2. KDD
              A Simulator-based Decision-Making Approach to Sequential Recommender Systems with Application in Ride-hailing Platform
              [ Link Code ]
              Xiong-Hui Chen, Yang Yu, Qingyang Li, Bowei He, Zhiwei (Tony) Qin, Wenjie Shang, and Jieping Ye.
              In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Workshop on Decision Intelligence and Analytics for Online Marketplaces. 2022.
            2021
            1. NeurIPS
              Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning
              [ Link Code ]
              Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, and Yang Yu.
              In Advances in Neural Information Processing Systems 34. 2021.
            2. NeurIPS
              Offline Model-based Adaptable Policy Learning
              [ Link Code ]
              Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei (Tony) Qin, Wenjie Shang, and Jieping Ye.
              In Advances in Neural Information Processing Systems 34. 2021.
            2020
            1. DAI
              Efficient Exploration by Novelty-Pursuit
              [ Link Code ]
              Ziniu Li and Xiong-Hui Chen*.
              In Proceedings of the 2nd International Conference on Distributed Artificial Intelligence. 2020.
            2019
            1. AAMAS
              Reinforcement Learning with Derivative-Free Exploration
              [ Link Code ]
              Xiong-Hui Chen and Yang Yu.
              In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems. 2019.