no code implementations • 3 Apr 2019 • Shaoying Wang, Haijiang Lai, Yifan Yang, Jian Yin
The following three steps are repeated until convergence: 1) the database network encodes all training samples into binary codes to obtain a whole rank list, 2) the query network is trained based on policy learning to maximize a reward that indicates the performance of the whole ranking list of binary codes, e. g., mean average precision (MAP), and 3) the database network is updated as the query network.