Search Results for author: Haobo Fu

Found 13 papers, 6 papers with code

Minimizing Weighted Counterfactual Regret with Optimistic Online Mirror Descent

1 code implementation • 22 Apr 2024 • Hang Xu, Kai Li, Bingyun Liu, Haobo Fu, Qiang Fu, Junliang Xing, Jian Cheng

Counterfactual regret minimization (CFR) is a family of algorithms for effectively solving imperfect-information games.

Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination

no code implementations • 5 Mar 2024 • Liangzhou Wang, Kaiwen Zhu, Fengming Zhu, Xinghu Yao, Shujie Zhang, Deheng Ye, Haobo Fu, Qiang Fu, Wei Yang

The common goal is an achievable state with high value, which is obtained by sampling from the distribution of future states.

Multi-agent Reinforcement Learning

Enhance Reasoning for Large Language Models in the Game Werewolf

1 code implementation • 4 Feb 2024 • Shuang Wu, Liwen Zhu, Tao Yang, Shiwei Xu, Qiang Fu, Yang Wei, Haobo Fu

This paper presents an innovative framework that integrates Large Language Models (LLMs) with an external Thinker module to enhance the reasoning capabilities of LLM-based agents.

Prompt Engineering

Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing

no code implementations • 22 Dec 2023 • Jinmin He, Kai Li, Yifan Zang, Haobo Fu, Qiang Fu, Junliang Xing, Jian Cheng

Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy.

Pointer Networks Trained Better via Evolutionary Algorithms

no code implementations • 2 Dec 2023 • Muyao Zhong, Shengcai Liu, Bingdong Li, Haobo Fu, Ke Tang, Peng Yang

With this advantage, this paper reports, for the first time, results for solving 1000-dimensional TSPs by training a PtrNet on the same dimensionality, which strongly suggests that scaling up the training instances is needed to improve the performance of PtrNet on higher-dimensional COPs.

Combinatorial Optimization • Evolutionary Algorithms

Diversity from Human Feedback

no code implementations • 10 Oct 2023 • Ren-Jian Wang, Ke Xue, Yutong Wang, Peng Yang, Haobo Fu, Qiang Fu, Chao Qian

DivHF learns a behavior descriptor consistent with human preference by querying human feedback.

Combinatorial Optimization • Ensemble Learning

Maximum Entropy Heterogeneous-Agent Reinforcement Learning

1 code implementation • 19 Jun 2023 • Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang

We embed cooperative MARL problems into probabilistic graphical models, from which we derive the maximum entropy (MaxEnt) objective for MARL.

Multi-agent Reinforcement Learning • reinforcement-learning • +1

Heterogeneous Multi-agent Zero-Shot Coordination by Coevolution

no code implementations • 9 Aug 2022 • Ke Xue, Yutong Wang, Cong Guan, Lei Yuan, Haobo Fu, Qiang Fu, Chao Qian, Yang Yu

Generating agents that can achieve zero-shot coordination (ZSC) with unseen partners is a new challenge in cooperative multi-agent reinforcement learning (MARL).

Multi-agent Reinforcement Learning

Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game

no code implementations • ICLR 2022 • Haobo Fu, Weiming Liu, Shuang Wu, Yijia Wang, Tao Yang, Kai Li, Junliang Xing, Bin Li, Bo Ma, Qiang Fu, Yang Wei

The deep policy gradient method has demonstrated promising results in many large-scale games, where the agent learns purely from its own experience.

counterfactual • Policy Gradient Methods

Cooperative Multi-Agent Reinforcement Learning with Sequential Credit Assignment

1 code implementation • NeurIPS 2021 • Yifan Zang, Jinmin He, Kai Li, Lily Cao, Haobo Fu, Qiang Fu, Junliang Xing

In this paper, we propose a cooperative MARL method with sequential credit assignment (SeCA) that deduces each agent's contribution to the team's success one by one to learn better cooperation.

counterfactual • Multi-agent Reinforcement Learning • +4

L2E: Learning to Exploit Your Opponent

no code implementations • 18 Feb 2021 • Zhe Wu, Kai Li, Enmin Zhao, Hang Xu, Meng Zhang, Haobo Fu, Bo An, Junliang Xing

In this work, we propose a novel Learning to Exploit (L2E) framework for implicit opponent modeling.

Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space

5 code implementations • 10 Oct 2018 • Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Lei Han, Yang Zheng, Haobo Fu, Tong Zhang, Ji Liu, Han Liu

Most existing deep reinforcement learning (DRL) frameworks consider either discrete action space or continuous action space solely.

reinforcement-learning • Reinforcement Learning (RL)

Parametrized Deep Q-Networks Learning: Playing Online Battle Arena with Discrete-Continuous Hybrid Action Space

1 code implementation • ICLR 2018 • Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Yang Zheng, Lei Han, Haobo Fu, Xiangru Lian, Carson Eisenach, Haichuan Yang, Emmanuel Ekwedike, Bei Peng, Haoyue Gao, Tong Zhang, Ji Liu, Han Liu

Most existing deep reinforcement learning (DRL) frameworks consider action spaces that are either discrete or continuous.
