Hey, I’m Yi Ma, an associate professor at Shanxi University. I received my PhD in June 2024 from the College of Intelligence and Computing at Tianjin University, where I was supervised by Professor Jianye Hao in his research group. I have published more than 20 papers in top AI conferences. Besides, I’m a huge fan of basketball, snowboarding and orienteering.
I’m interested in:
- Reinforcement Learning
- Offline Reinforcement Learning
- Embodied AI
- Application of Deep Reinforcement Learning
If you’re interested in these domains, please send me your CV.
🎓 Education
- 2020.09 - 2024.06, Tianjin University, PhD
- 2018.09 - 2020.06, Tianjin University, Master
- 2014.09 - 2018.06, Tianjin University, Bachelor
📝 Selected Publications
Papers
PS: Authors with equal contribution are marked by *.
Yi Ma, Jianye Hao, Xiaohan Hu, Yan Zheng, Chenjun Xiao.
Iteratively Refined Behavior Regularization for Offline Reinforcement Learning.
NeurIPS 2024. (CCF A)
[link]
Jiashun Liu, Jianye Hao, Xiaotian Hao, Yi Ma, Yan Zheng, Yujing Hu, Tangjie Lv.
Unlock the Intermittent Control Ability of Model Free Reinforcement Learning.
NeurIPS 2024. (CCF A)
[link]
Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yi Ma, Pengyi Li, Yan Zheng.
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making.
NeurIPS 2024 Datasets and Benchmarks Track. (CCF A)
[link]
Yi Ma, Jianye Hao, Hebin Liang, Chenjun Xiao.
Rethinking Decision Transformer via Hierarchical Reinforcement Learning.
ICML 2024. (CCF A)
[link]
Jiashun Liu, Jianye Hao, Yi Ma, Shuyin Xia.
Imagine Big from Small: Unlock the Cognitive Generalization of Deep Reinforcement Learning from Simple Scenarios.
ICML 2024. (CCF A)
[link]
Kai Zhao, Jianye Hao, Yi Ma, Jinyi Liu, Yan Zheng, Zhaopeng Meng.
ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles.
IJCAI 2024. (CCF A)
[link]
Yifu Yuan, Jianye Hao, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, Yan Zheng.
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback.
ICLR 2024. (Top AI Conference)
[link]
Yi Ma, Hongyao Tang, Dong Li, Zhaopeng Meng.
Reining Generalization in Offline Reinforcement Learning via Representation Distinction.
NeurIPS 2023. (CCF A)
[link]
Hebin Liang, Zibin Dong, Yi Ma, Xiaotian Hao, Yan Zheng, Jianye Hao.
A Hierarchical Imitation Learning-based Decision Framework for Autonomous Driving.
CIKM 2023. (CCF B)
[link]
Hebin Liang*, Yi Ma*, Zilin Cao, Tianyang Liu, Fei Ni, Zhigang Li, Jianye Hao.
SplitNet: A Reinforcement Learning based Sequence Splitting Method for the MinMax Multiple Travelling Salesman Problem.
AAAI 2023. (CCF A)
[link]
Yi Ma, Chao Wang, Chen Chen, Jinyi Liu, Zhaopeng Meng, Yan Zheng, Jianye Hao.
OSCAR: OOD State Conservative Offline Reinforcement Learning for Sequential Decision Making.
CAAI Artificial Intelligence Research 2023.
[link]
Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang.
PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations.
IJCAI 2022. (CCF A)
[link]
Yi Ma, Xiaotian Hao*, Jianye Hao, Jiawen Lu, Xing Liu, Xialiang Tong, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng.
A hierarchical reinforcement learning based optimization framework for large scale dynamic pickup and delivery problems.
NeurIPS 2021. (CCF A)
[link]
Fei Ni, Jianye Hao, Jiawen Lu, Xialiang Tong, Mingxuan Yuan, Jiahui Duan, Yi Ma, Kun He.
A Multi-Graph Attributed Reinforcement Learning based Optimization Algorithm for Large-scale Hybrid Flow Shop Scheduling Problem.
KDD 2021. (CCF A)
[link]
Xiaotian Hao*, Zhaoqing Peng*, Yi Ma*, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai.
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising.
ICML 2020. (CCF A)
[link]
Patents
- CN113850414B. Method for Logistics Scheduling Planning Based on Graph Neural Networks and Reinforcement Learning. (First Inventor, Authorized)
- CN114130034B. Multi Agent Game AI Design Method Based on Attention Mechanism and Reinforcement Learning. (Fifth Inventor, Authorized)
- CN113947348A. A Method and Device for Order Allocation. (Second Inventor, Under Review)
- CN113869489A. Complex Game AI Design Method Based on Hierarchical Deep Reinforcement Learning. (Third Inventor, Authorized)
- CN113869488A. Reinforcement Learning Method for Game AI Agents in Continuous Discrete Mixed Decision Environments. (Third Inventor, Authorized)
- CN114169421A. Cooperative Exploration Method in Sparse Reward Environments for Multi Agent Systems Based on Intrinsic Motivation. (Fourth Inventor, Under Review)
- CN114139681A. Meta Reinforcement Learning Method Based on Contrastive Learning and Mutual Information. (Fourth Inventor, Under Review)
🏅 Competitions and Honors
- 2022.12, NeurIPS 2022 SMARTS Autonomous Driving Competition. First Prize in both tracks. [link]
- 2021.06, Huawei 2012 Central Research Institute Innovation Pioneer. President's Award Second Prize.
- 2017.12, Intel Cup National College Students Software Innovation Competition. National Finals Third Prize.
🏛️ Invited Talks
- 2024.07, Transformer-based Models in Decision Making. @NJU
- 2023.12, Reining Generalization in Offline Reinforcement Learning via Representation Distinction. @DAI 2023
- 2022.11, The Difficulty of Passive Learning in Deep Reinforcement Learning. @Huawei, Chaspark
- 2022.01, A hierarchical reinforcement learning based optimization framework for large scale dynamic pickup and delivery problems. @RLChina
💻 Internships
- 2024.01-2024.07, Qiyuan Lab, AI Foundation Team. Supervised by Chen Chen.
- 2020.11-2023.11, Huawei Noah’s Ark Lab, Decision and Reasoning Team. Supervised by Chenjun Xiao, Dong Li, Chen Chen and Chao Wang.
- 2020.04-2020.10, Huawei Noah’s Ark Lab, Enterprise Intelligence Team. Supervised by Jiawen Lu.
- 2019.07-2019.12, Alibaba, Alimama Target Advertising Team. Supervised by Junqi Jin.
Academic Service
- Reviewer for Conferences: ICML, ICLR, NeurIPS, AAAI, IJCAI, AAMAS, DAI, CIKM.
- Student Contact for RLChina.