Hey, I’m Yi Ma, an associate professor at Shanxi University. I received my PhD in June 2024 from the College of Intelligence and Computing at Tianjin University, where I was supervised by Professor Jianye Hao in his research group. I have published more than 20 papers at top AI conferences. Besides, I’m a huge fan of basketball, snowboarding, and orienteering.

I’m interested in:

  • Reinforcement Learning
  • Offline Reinforcement Learning
  • Embodied AI
  • Application of Deep Reinforcement Learning

If you’re interested in these domains, please send me your CV.

🎓 Education

  • 2020.09 - 2024.06, Tianjin University, PhD
  • 2018.09 - 2020.06, Tianjin University, Master
  • 2014.09 - 2018.06, Tianjin University, Bachelor

📝 Selected Publications

Papers


PS: Authors with equal contribution are marked by *.

NeurIPS 2024

Yi Ma, Jianye Hao, Xiaohan Hu, Yan Zheng, Chenjun Xiao.
Iteratively Refined Behavior Regularization for Offline Reinforcement Learning.
NeurIPS 2024. (CCF A)
[link]

NeurIPS 2024

Jiashun Liu, Jianye Hao, Xiaotian Hao, Yi Ma, Yan Zheng, Yujing Hu, Tangjie Lv.
Unlock the Intermittent Control Ability of Model Free Reinforcement Learning.
NeurIPS 2024. (CCF A)
[link]

NeurIPS 2024

Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yi Ma, Pengyi Li, Yan Zheng.
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making.
NeurIPS 2024 Datasets and Benchmarks Track. (CCF A)
[link]

ICML 2024

Yi Ma, Jianye Hao, Hebin Liang, Chenjun Xiao.
Rethinking Decision Transformer via Hierarchical Reinforcement Learning.
ICML 2024. (CCF A)
[link]

ICML 2024

Jiashun Liu, Jianye Hao, Yi Ma, Shuyin Xia.
Imagine Big from Small: Unlock the Cognitive Generalization of Deep Reinforcement Learning from Simple Scenarios.
ICML 2024. (CCF A)
[link]

IJCAI 2024

Kai Zhao, Jianye Hao, Yi Ma, Jinyi Liu, Yan Zheng, Zhaopeng Meng.
ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles.
IJCAI 2024. (CCF A)
[link]

ICLR 2024

Yifu Yuan, Jianye Hao, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, Yan Zheng.
Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback.
ICLR 2024. (Top AI Conference)
[link]

NeurIPS 2023

Yi Ma, Hongyao Tang, Dong Li, Zhaopeng Meng.
Reining Generalization in Offline Reinforcement Learning via Representation Distinction.
NeurIPS 2023. (CCF A)
[link]

CIKM 2023

Hebin Liang, Zibin Dong, Yi Ma, Xiaotian Hao, Yan Zheng, Jianye Hao.
A Hierarchical Imitation Learning-based Decision Framework for Autonomous Driving.
CIKM 2023. (CCF B)
[link]

AAAI 2023

Hebin Liang*, Yi Ma*, Zilin Cao, Tianyang Liu, Fei Ni, Zhigang Li, Jianye Hao.
SplitNet: A Reinforcement Learning based Sequence Splitting Method for the MinMax Multiple Travelling Salesman Problem.
AAAI 2023. (CCF A)
[link]

CAAI AIR 2023

Yi Ma, Chao Wang, Chen Chen, Jinyi Liu, Zhaopeng Meng, Yan Zheng, Jianye Hao.
OSCAR: OOD State Conservative Offline Reinforcement Learning for Sequential Decision Making.
CAAI Artificial Intelligence Research 2023.
[link]

IJCAI 2022

Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang.
PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations.
IJCAI 2022. (CCF A)
[link]

NeurIPS 2021

Yi Ma, Xiaotian Hao*, Jianye Hao, Jiawen Lu, Xing Liu, Xialiang Tong, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng.
A hierarchical reinforcement learning based optimization framework for large scale dynamic pickup and delivery problems.
NeurIPS 2021. (CCF A)
[link]

KDD 2021

Fei Ni, Jianye Hao, Jiawen Lu, Xialiang Tong, Mingxuan Yuan, Jiahui Duan, Yi Ma, Kun He.
A Multi-Graph Attributed Reinforcement Learning based Optimization Algorithm for Large-scale Hybrid Flow Shop Scheduling Problem.
KDD 2021. (CCF A)
[link]

ICML 2020

Xiaotian Hao*, Zhaoqing Peng*, Yi Ma*, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai.
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising.
ICML 2020. (CCF A)
[link]

Patents


  • CN113850414B. Method for Logistics Scheduling Planning Based on Graph Neural Networks and Reinforcement Learning. (First Inventor, Authorized)
  • CN114130034B. Multi Agent Game AI Design Method Based on Attention Mechanism and Reinforcement Learning. (Fifth Inventor, Authorized)
  • CN113947348A. A Method and Device for Order Allocation. (Second Inventor, Under Review)
  • CN113869489A. Complex Game AI Design Method Based on Hierarchical Deep Reinforcement Learning. (Third Inventor, Authorized)
  • CN113869488A. Reinforcement Learning Method for Game AI Agents in Continuous Discrete Mixed Decision Environments. (Third Inventor, Authorized)
  • CN114169421A. Cooperative Exploration Method in Sparse Reward Environments for Multi Agent Systems Based on Intrinsic Motivation. (Fourth Inventor, Under Review)
  • CN114139681A. Meta Reinforcement Learning Method Based on Contrastive Learning and Mutual Information. (Fourth Inventor, Under Review)

🏅 Competitions and Honors

  • 2022.12, NeurIPS 2022 SMARTS Autonomous Driving Competition. First Prize in both tracks. [link]
  • 2021.06, Huawei 2012 Central Research Institute Innovation Pioneer. President's Award Second Prize.
  • 2017.12, Intel Cup National College Students Software Innovation Competition. National Finals Third Prize.

🏛️ Invited Talks

  • 2024.07, Transformer-based Models in Decision Making. @NJU
  • 2023.12, Reining Generalization in Offline Reinforcement Learning via Representation Distinction. @DAI 2023
  • 2022.11, The Difficulty of Passive Learning in Deep Reinforcement Learning. @Huawei, Chaspark
  • 2022.01, A hierarchical reinforcement learning based optimization framework for large scale dynamic pickup and delivery problems. @RLChina

💻 Internships

  • 2024.01-2024.07, Qiyuan Lab, AI Foundation Team. Supervised by Chen Chen.
  • 2020.11-2023.11, Huawei Noah’s Ark Lab, Decision and Reasoning Team. Supervised by Chenjun Xiao, Dong Li, Chen Chen and Chao Wang.
  • 2020.04-2020.10, Huawei Noah’s Ark Lab, Enterprise Intelligence Team. Supervised by Jiawen Lu.
  • 2019.07-2019.12, Alibaba, Alimama Target Advertising Team. Supervised by Junqi Jin.

Academic Service

  • Reviewer for Conferences: ICML, ICLR, NeurIPS, AAAI, IJCAI, AAMAS, DAI, CIKM.

  • Student Contact for RLChina.