no code implementations • 13 May 2024 • Theodore Jerome Tinker, Kenji Doya, Jun Tani
Two rewards encouraging efficient exploration are the entropy of action policy and curiosity for information gain.
1 code implementation • 29 Jun 2023 • Florian Lalande, Kenji Doya
We compare our method with previous data imputation methods using artificial and real-world data with different data missing scenarios and various data missing rates, and show that our method can cope with complex original data structure, yields lower data imputation errors, and provides probabilistic estimates with higher likelihood than current methods.
1 code implementation • 11 Apr 2023 • Dongqi Han, Kenji Doya, Dongsheng Li, Jun Tani
The habitual behavior is generated using the prior distribution of intention, which is goal-less, and the goal-directed behavior is generated using the posterior distribution of intention, which is conditioned on the goal.
no code implementations • 29 Sep 2021 • Florian Lalande, Kenji Doya
As databases are ubiquitous nowadays, missing values constitute a pervasive problem for data analysis.
no code implementations • ICLR 2022 • Dongqi Han, Tadashi Kozuno, Xufang Luo, Zhao-Yun Chen, Kenji Doya, Yuqing Yang, Dongsheng Li
How to make intelligent decisions is a central problem in machine learning and cognitive science.
no code implementations • 18 Jun 2021 • Dongqi Han, Kenji Doya, Jun Tani
Habitual behavior, which is obtained from the prior distribution of $z$, is acquired by reinforcement learning.
no code implementations • 5 Jun 2021 • Kenji Doya
Here we consider the hypothesis that the sensory and motor cortical circuits implement the dual computations for Bayesian inference and optimal control, or perceptual and value-based decision making, respectively.
no code implementations • 15 Mar 2021 • Tadahiro Taniguchi, Hiroshi Yamakawa, Takayuki Nagai, Kenji Doya, Masamichi Sakagami, Masahiro Suzuki, Tomoaki Nakamura, Akira Taniguchi
This approach is based on two ideas: (1) brain-inspired AI, learning human brain architecture to build human-level intelligence, and (2) a probabilistic generative model (PGM)-based cognitive system to develop a cognitive system for developmental robots by integrating PGMs.
no code implementations • 17 Aug 2020 • Eiji Uchibe, Kenji Doya
A forward RL step minimizes the reverse KL estimated by the inverse RL step.
1 code implementation • ICLR 2020 • Dongqi Han, Kenji Doya, Jun Tani
In partially observable (PO) environments, deep reinforcement learning (RL) agents often suffer from unsatisfactory performance, since two problems need to be tackled together: how to extract information from the raw observations to solve the task, and how to improve the policy.
no code implementations • 2 Aug 2019 • Henrik Skibbe, Akiya Watakabe, Ken Nakae, Carlos Enrique Gutierrez, Hiromichi Tsukada, Junichi Hata, Takashi Kawase, Rui Gong, Alexander Woodward, Kenji Doya, Hideyuki Okano, Tetsuo Yamamori, Shin Ishii
Understanding the connectivity in the brain is an important prerequisite for understanding how the brain processes information.
no code implementations • 18 Jun 2019 • Tadashi Kozuno, Dongqi Han, Kenji Doya
We provide detailed theoretical analysis of the new algorithm that shows its efficiency and noise-tolerance inherited from Retrace and advantage learning.
3 code implementations • ICML 2018 • Paavo Parmas, Carl Edward Rasmussen, Jan Peters, Kenji Doya
Previously, the exploding gradient problem has been explained to be central in deep learning and model-based reinforcement learning, because it causes numerical issues and instability in optimization.
1 code implementation • 29 Jan 2019 • Dongqi Han, Kenji Doya, Jun Tani
Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals than when starting from scratch.
no code implementations • 25 Jul 2018 • Stefan Elfwing, Eiji Uchibe, Kenji Doya
In this study, by adopting features of the EE-RBM approach to feed-forward neural networks, we propose the UnBounded output network (UBnet) which is characterized by three features: (1) unbounded output units; (2) the target value of correct classification is set to a value much greater than one; and (3) the models are trained by a modified mean-squared error objective.
no code implementations • 30 Oct 2017 • Tadashi Kozuno, Eiji Uchibe, Kenji Doya
Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks.
no code implementations • 24 Feb 2017 • Stefan Elfwing, Eiji Uchibe, Kenji Doya
In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters.
no code implementations • 10 Feb 2017 • Stefan Elfwing, Eiji Uchibe, Kenji Doya
First, we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its derivative function (dSiLU).
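The two activation functions named above can be written down directly from their definitions: the SiLU multiplies its input by the logistic sigmoid, and the dSiLU is the derivative of that product. The following is a minimal sketch in plain Python, not the authors' implementation; the function names `sigmoid`, `silu`, and `dsilu` are chosen here for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x: float) -> float:
    """Sigmoid-weighted linear unit: x * sigmoid(x)."""
    return x * sigmoid(x)


def dsilu(x: float) -> float:
    """Derivative of the SiLU with respect to x.

    d/dx [x * s(x)] = s(x) + x * s(x) * (1 - s(x))
                    = s(x) * (1 + x * (1 - s(x)))
    """
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))
```

For large positive inputs `silu(x)` approaches `x`, and for large negative inputs it approaches zero, so it behaves like a smooth counterpart of the rectified linear unit.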
no code implementations • 21 Oct 2015 • Tomoki Tokuda, Junichiro Yoshimoto, Yu Shimizu, Shigeru Toki, Go Okada, Masahiro Takamura, Tetsuya Yamamoto, Shinpei Yoshimura, Yasumasa Okamoto, Shigeto Yamawaki, Kenji Doya
We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view.
no code implementations • NeurIPS 2009 • Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya
In this paper, we describe a generalized Natural Gradient (gNG) by linearly interpolating the two FIMs and propose an efficient implementation for the gNG learning based on a theory of the estimating function, generalized Natural Actor-Critic (gNAC).