no code implementations • 8 Jan 2024 • Shuxiao Ma, Linyuan Wang, Senbao Hou, Bin Yan
Next, we use a contrastive loss function to minimize the distance between the image embedding features and the text embedding features, aligning the stimulus image with its text information.
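The excerpt does not specify the exact form of the loss, so the following is only a minimal sketch of one common choice: a symmetric, CLIP-style InfoNCE loss over a batch of paired embeddings. The function names, the `temperature` value, and the use of cosine similarity are all assumptions made for illustration and are not taken from the paper.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct class for row i is column i,
    # i.e. the i-th image is paired with the i-th text.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Hypothetical helper: symmetric InfoNCE over paired embeddings.
    # temperature=0.07 is an assumed value, not one stated in the source.
    img = [l2_normalize(v) for v in image_emb]
    txt = [l2_normalize(v) for v in text_emb]
    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txt]
              for i in img]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy batch: two image embeddings and their matching text embeddings.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this sketch the loss is small when each image embedding is closest to its own text embedding and large when the pairs are mismatched, which is the alignment behaviour the excerpt describes.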
no code implementations • 29 Aug 2023 • Shuxiao Ma, Linyuan Wang, Bin Yan
A convolutional network then maps this multimodal feature space to voxel space, forming the multimodal visual information encoding model.