no code implementations • 11 Mar 2024 • Koji Inoue, Bing'er Jiang, Erik Ekstedt, Tatsuya Kawahara, Gabriel Skantze
The results show that a monolingual VAP model trained on one language does not make good predictions when applied to other languages.
1 code implementation • 10 Jan 2024 • Koji Inoue, Bing'er Jiang, Erik Ekstedt, Tatsuya Kawahara, Gabriel Skantze
A demonstration of a real-time and continuous turn-taking prediction system is presented.
no code implementations • 3 May 2023 • Bing'er Jiang, Erik Ekstedt, Gabriel Skantze
Treating turn prediction and response ranking as a one-stage process, we find that our model can be used as an incremental response ranker, which can be applied in various settings.
no code implementations • 3 May 2023 • Bing'er Jiang, Erik Ekstedt, Gabriel Skantze
Filled pauses (or fillers), such as "uh" and "um", are frequent in spontaneous speech and can serve as a turn-holding cue for the listener, indicating that the current speaker is not done yet.