Measurement-based adaptation protocol with quantum reinforcement learning

14 Mar 2018  ·  F. Albarrán-Arriagada, J. C. Retamal, E. Solano, L. Lamata

Machine learning employs dynamical algorithms that mimic the human capacity to learn; among them, reinforcement learning algorithms are the closest to the way humans learn. At the same time, adaptability is essential for performing any task efficiently in a changing environment, and it underlies many processes, such as natural selection. Here, we propose an algorithm based on successive measurements to adapt one quantum state to an unknown reference state, in the sense of achieving maximum overlap. The protocol naturally relies on many identical copies of the reference state, so that each measurement iteration extracts more information about it. In our protocol, we consider a system composed of three parts: the "environment" system, which provides the copies of the reference state; the register, an auxiliary subsystem that interacts with the environment to acquire information from it; and the agent, the quantum state that is adapted by digital feedback conditioned on the outcomes of the measurements on the register. With this proposal we achieve an average fidelity between the environment and the agent of more than $90\%$ with fewer than $30$ iterations of the protocol. In addition, we extend the formalism to $d$-dimensional states, reaching an average fidelity of around $80\%$ in fewer than $400$ iterations for $d=11$, for a variety of genuinely quantum and semiclassical states. This work paves the way for the development of quantum reinforcement learning protocols using quantum data and for the future deployment of semi-autonomous quantum systems.
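As a rough illustration of the qubit case, below is a minimal classical simulation sketch, not the paper's exact gate sequence or update schedule: each fresh environment copy is measured in the basis set by the current agent state; an aligned outcome is treated as a reward that narrows the exploration range, while a misaligned outcome is treated as a punishment that applies a random corrective rotation to the agent and widens the range again. The function names (`adapt_agent`, `measure_copy`), the shrink factor of 0.7, and the random-axis correction are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)

# Pauli matrices used to build feedback rotations
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def random_qubit():
    """Haar-random pure qubit state as a normalized 2-vector."""
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)


def fidelity(a, b):
    """|<a|b>|^2 for pure states."""
    return abs(np.vdot(a, b)) ** 2


def measure_copy(env_state, agent_state):
    """Measure a fresh environment copy in the basis defined by the agent.
    Returns 0 (reward: copy found aligned with the agent) with probability
    |<agent|env>|^2, otherwise 1 (punishment)."""
    return 0 if rng.random() < fidelity(agent_state, env_state) else 1


def random_rotation(max_angle):
    """Rotation by a random angle within ±max_angle about a random Bloch axis."""
    theta = rng.uniform(-max_angle, max_angle)
    n = rng.normal(size=3)
    n /= np.linalg.norm(n)
    generator = n[0] * X + n[1] * Y + n[2] * Z
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * generator


def adapt_agent(env_state, iterations=30, shrink=0.7):
    """Adapt an agent qubit toward an unknown environment state.
    Rewarded outcomes narrow the exploration range; punished outcomes apply a
    random corrective rotation and widen it again (assumed update schedule)."""
    agent = np.array([1.0, 0.0], dtype=complex)  # start in |0>
    delta = np.pi                                # current exploration range
    for _ in range(iterations):
        if measure_copy(env_state, agent) == 0:
            delta *= shrink                          # reward: exploit
        else:
            agent = random_rotation(delta) @ agent   # punishment: explore
            delta = min(np.pi, delta / shrink)
    return agent


# Average fidelity over many Haar-random environment states
fids = []
for _ in range(500):
    env = random_qubit()
    fids.append(fidelity(adapt_agent(env), env))
print(f"mean fidelity after 30 iterations: {np.mean(fids):.3f}")
```

The annealing of the exploration range is what lets the agent settle once its overlap with the unknown reference state is high; the reward and punishment rates used in the paper differ from this simplified choice.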
