no code implementations • SmiLa (LREC) 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit
Laughter is not just an audio signal but an integral part of multimodal non-verbal communication: in addition to audio, it involves facial expressions and body movements.
no code implementations • SmiLa (LREC) 2022 • Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit
Women smile more than men, although women's expressiveness is not uniformly greater across all facial actions.
no code implementations • 1 Sep 2023 • Ahmad Hammoudeh, Stéphane Dupont
Image registration (IR) is a process that deforms images to align them with a reference space, making it easier for medical practitioners to examine different medical images in a standardized frame of reference, e.g. with the same rotation and scale.
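The snippet above describes mapping images into a common reference frame via rotation and scale. As a minimal sketch (not the paper's method), the coordinate side of such an alignment can be expressed as a 2D similarity transform applied to image points; the function names here are illustrative:

```python
import numpy as np

def similarity_transform(scale, angle_rad, tx, ty):
    """Build a 2D similarity transform (rotation + uniform scale +
    translation) that maps image coordinates into a reference frame."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty],
                     [0.0,        0.0,       1.0]])

def register_points(points, transform):
    """Apply the transform to an Nx2 array of points, using
    homogeneous coordinates for the translation component."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    return (transform @ homo.T).T[:, :2]
```

In practice a registration method estimates the transform parameters (here they are given by hand), but the application step is the same.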
no code implementations • 30 May 2023 • Omar Seddati, Nathan Hubens, Stéphane Dupont, Thierry Dutoit
Then, we introduce the Relative Triplet Loss (RTL), an adapted triplet loss that overcomes these limitations by weighting the loss according to anchor similarity.
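The snippet does not give the exact RTL formulation, but the idea of a triplet loss whose contribution is weighted by a similarity term can be sketched as follows; the choice of cosine anchor-positive similarity as the weight is an assumption for illustration, not the paper's definition:

```python
import numpy as np

def relative_triplet_loss(anchor, positive, negative, margin=0.2):
    """Hedged sketch of a similarity-weighted triplet loss: the classic
    hinge term is scaled per-triplet by the anchor-positive cosine
    similarity (an assumed weighting), so triplets are not all treated
    as equally important."""
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    a, p, n = unit(anchor), unit(positive), unit(negative)
    d_ap = np.sum((a - p) ** 2, axis=1)           # squared anchor-positive distance
    d_an = np.sum((a - n) ** 2, axis=1)           # squared anchor-negative distance
    base = np.maximum(d_ap - d_an + margin, 0.0)  # standard triplet hinge
    weight = np.sum(a * p, axis=1)                # cosine similarity as weight (assumed)
    return float(np.mean(weight * base))
```

A satisfied triplet (positive closer than negative by the margin) contributes zero, as in the standard triplet loss.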
no code implementations • 14 Sep 2022 • Omar Seddati, Stéphane Dupont, Saïd Mahmoudi, Thierry Dutoit
Sketch-based image retrieval (SBIR) is the task of retrieving natural images (photos) that match the semantics and the spatial configuration of hand-drawn sketch queries.
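The retrieval step in SBIR is typically a nearest-neighbor search in a shared embedding space for sketches and photos. A minimal sketch of that ranking step, assuming precomputed embeddings (the encoder itself is the paper's contribution and is not shown):

```python
import numpy as np

def retrieve(sketch_emb, photo_embs, k=3):
    """Rank photos by cosine similarity to a sketch embedding, assuming
    both already live in a shared embedding space, as is typical in
    sketch-based image retrieval."""
    s = sketch_emb / np.linalg.norm(sketch_emb)
    p = photo_embs / np.linalg.norm(photo_embs, axis=1, keepdims=True)
    sims = p @ s                     # cosine similarity to each photo
    order = np.argsort(-sims)        # best match first
    return order[:k], sims[order[:k]]
```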
no code implementations • 20 May 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit
Laughter is not just an audio signal but an integral part of multimodal non-verbal communication: in addition to audio, it involves facial expressions and body movements.
no code implementations • 11 Feb 2022 • Ahmad Hammoudeh, Bastien Vanderplaetse, Stéphane Dupont
This work aims at generating captions for soccer videos using deep learning.
1 code implementation • 12 Jun 2021 • Mathilde Brousmiche, Jean Rouat, Stéphane Dupont
Event classification is inherently sequential and multimodal.
no code implementations • 9 Nov 2020 • Bastien Vanderplaetse, Stéphane Dupont
Action spotting and classification are the tasks of finding the temporal anchors of events in a video and determining which events they are.
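A common post-processing step for spotting temporal anchors is to take local maxima of per-frame confidence scores above a threshold. This is a generic sketch of that step, not the paper's specific pipeline:

```python
def spot_actions(scores, threshold=0.5):
    """Minimal action-spotting sketch: given per-frame confidence scores
    for one event class, return the frame indices that are local maxima
    above a threshold -- the candidate temporal anchors."""
    anchors = []
    for t in range(1, len(scores) - 1):
        if (scores[t] >= threshold
                and scores[t] >= scores[t - 1]
                and scores[t] > scores[t + 1]):
            anchors.append(t)
    return anchors
```

Classification then assigns each anchor to the event class whose score peaked there.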
Ranked #5 on Action Spotting on SoccerNet
1 code implementation • EMNLP (nlpbt) 2020 • Jean-Benoit Delbrouck, Noé Tits, Stéphane Dupont
This paper aims to bring a new lightweight yet powerful solution for the task of Emotion Recognition and Sentiment Analysis.
Ranked #6 on Multimodal Sentiment Analysis on CMU-MOSEI (using extra training data)
no code implementations • 2 Oct 2020 • Mathilde Brousmiche, Stéphane Dupont, Jean Rouat
We introduce the AVECL-UMons dataset for audio-visual event classification and localization in the context of office environments.
1 code implementation • WS 2020 • Jean-Benoit Delbrouck, Noé Tits, Mathilde Brousmiche, Stéphane Dupont
Expressed sentiment and emotions are two crucial factors in human multimodal language.
Ranked #5 on Multimodal Sentiment Analysis on CMU-MOSEI (using extra training data)
1 code implementation • 31 Oct 2019 • Jean-Benoit Delbrouck, Bastien Vanderplaetse, Stéphane Dupont
Recently, generative adversarial networks (GANs) have gathered a lot of interest.
no code implementations • 8 Oct 2019 • Jean-Benoit Delbrouck, Antoine Maiorca, Nathan Hubens, Stéphane Dupont
As new datasets for real-world visual reasoning and compositional question answering emerge, it may become necessary to run visual feature extraction as an end-to-end process during training.
no code implementations • 7 Oct 2019 • Jean-Benoit Delbrouck, Stéphane Dupont
Even with the growing interest in problems at the intersection of Computer Vision and Natural Language, grounding (i.e., identifying) the components of a structured description in an image remains a challenging task.
no code implementations • 22 Nov 2018 • Jean-Benoit Delbrouck, Stéphane Dupont
When searching for an object, humans navigate through a scene using semantic information and spatial relationships.
1 code implementation • 15 Oct 2018 • Jean-Benoit Delbrouck, Stéphane Dupont
This paper describes the UMONS solution for the Multimodal Machine Translation Task presented at the third conference on machine translation (WMT18).
no code implementations • 15 Oct 2018 • Jean-Benoit Delbrouck, Stéphane Dupont
So far, the goal has been to maximize scores on automated metrics, and to do so, one has to come up with a plurality of new modules and techniques.
no code implementations • 19 Jan 2018 • Matei Mancas, Christian Frisson, Joëlle Tilmanne, Nicolas D'Alessandro, Petr Barborka, Furkan Bayansar, Francisco Bernard, Rebecca Fiebrink, Alexis Heloir, Edgar Hemery, Sohaib Laraba, Alexis Moinet, Fabrizio Nunnari, Thierry Ravet, Loïc Reboursière, Alvaro Sarasua, Mickaël Tits, Noé Tits, François Zajéga, Paolo Alborno, Ksenia Kolykhalova, Emma Frid, Damiano Malafronte, Lisanne Huis in't Veld, Hüseyin Cakmak, Kevin El Haddad, Nicolas Riche, Julien Leroy, Pierre Marighetto, Bekir Berker Türker, Hossein Khaki, Roberto Pulisci, Emer Gilmartin, Fasih Haider, Kübra Cengiz, Martin Sulir, Ilaria Torre, Shabbir Marzban, Ramazan Yazıcı, Furkan Burak Bâgcı, Vedat Gazi Kılı, Hilal Sezer, Sena Büsra Yenge, Charles-Alexandre Delestage, Sylvie Leleu-Merviel, Muriel Meyer-Chemenska, Daniel Schmitt, Willy Yvart, Stéphane Dupont, Ozan Can Altiok, Aysegül Bumin, Ceren Dikmen, Ivan Giangreco, Silvan Heller, Emre Külah, Gueorgui Pironkov, Luca Rossetto, Yusuf Sahillioglu, Heiko Schuldt, Omar Seddati, Yusuf Setinkaya, Metin Sezgin, Claudiu Tanase, Emre Toyan, Sean Wood, Doguhan Yeke, François Rocca, Pierre-Henri De Deken, Alessandra Bandrabur, Fabien Grisard, Axel Jean-Caurant, Vincent Courboulay, Radhwan Ben Madhkour, Ambroise Moreau
The 11th Summer Workshop on Multimodal Interfaces eNTERFACE 2015 was hosted by the Numediart Institute of Creative Technologies of the University of Mons from August 10th to September 2015.
1 code implementation • 9 Dec 2017 • Jean-Benoit Delbrouck, Stéphane Dupont
We propose a new and fully end-to-end approach for multimodal translation where the source text encoder modulates the entire visual input processing using conditional batch normalization, in order to compute the most informative image features for our task.
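Conditional batch normalization, as described in the snippet, normalizes visual features and then scales and shifts them with parameters predicted from the source-text encoding. A minimal numpy sketch of that mechanism, with illustrative weight shapes and a linear predictor as an assumption:

```python
import numpy as np

def conditional_batch_norm(feat, text_vec, W_gamma, W_beta, eps=1e-5):
    """Sketch of conditional batch normalization: normalize visual
    features over the batch, then scale/shift per channel with gamma
    and beta predicted linearly from the text encoding. The linear
    predictors and shapes are illustrative, not the paper's exact setup."""
    mean = feat.mean(axis=0, keepdims=True)       # per-channel batch mean
    var = feat.var(axis=0, keepdims=True)         # per-channel batch variance
    norm = (feat - mean) / np.sqrt(var + eps)
    gamma = 1.0 + text_vec @ W_gamma              # text-conditioned scale (delta around 1)
    beta = text_vec @ W_beta                      # text-conditioned shift
    return gamma * norm + beta
```

With zero predictor weights this reduces to plain batch normalization; the text input only modulates the features through the learned gamma/beta offsets.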
no code implementations • EMNLP 2017 • Jean-Benoit Delbrouck, Stéphane Dupont
In state-of-the-art Neural Machine Translation (NMT), an attention mechanism is used during decoding to enhance the translation.
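The attention step during decoding can be sketched generically: score each encoder state against the decoder query, softmax the scores, and form a weighted context vector. This is the standard dot-product variant, shown for illustration rather than as the paper's specific mechanism:

```python
import numpy as np

def attention(query, keys, values):
    """Dot-product attention as used during NMT decoding: score each
    source state (keys) against the decoder query, softmax the scores,
    and return the attention-weighted context over the values."""
    scores = keys @ query                       # one score per source position
    scores = scores - scores.max()              # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ values                  # weighted sum of source states
    return context, weights
```

The decoder consumes the context vector at each step, letting the translation attend to different source positions over time.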
no code implementations • 4 Jul 2017 • Jean-Benoit Delbrouck, Stéphane Dupont, Omar Seddati
In Multimodal Neural Machine Translation (MNMT), a neural model generates a translated sentence that describes an image, given the image itself and one source description in English.