no code implementations • 2 Apr 2024 • Kei Sawada, Tianyu Zhao, Makoto Shing, Kentaro Mitsui, Akio Kaga, Yukiya Hono, Toshiaki Wakatsuki, Koh Mitsuda
AI democratization aims to create a world in which the average person can utilize AI techniques.
no code implementations • 6 Dec 2023 • Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, Kei Sawada
Advances in machine learning have made it possible to perform various text and speech processing tasks, including automatic speech recognition (ASR), in an end-to-end (E2E) manner.
no code implementations • 2 Oct 2023 • Kentaro Mitsui, Yukiya Hono, Kei Sawada
The advent of large language models (LLMs) has made it possible to generate natural written dialogues between two agents.
no code implementations • 28 Feb 2023 • Kentaro Mitsui, Yukiya Hono, Kei Sawada
The two primary frameworks used for talking face generation comprise a text-driven framework, which generates synchronized speech and talking faces from text, and a speech-driven framework, which generates talking faces from speech.
no code implementations • 14 Feb 2023 • AprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio Okura
To this end, we leverage knowledge from recent large-scale pre-trained generative models, resulting in text-guided sketch-to-photo synthesis without the need for reference images.
no code implementations • 24 Jun 2022 • Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, Keiichi Tokuda
A style encoder that extracts a latent speaking style representation from speech is trained jointly with the text-to-speech (TTS) model.
no code implementations • 28 Sep 2021 • Kentaro Mitsui, Kei Sawada
In this study, we propose a method to handle multiple sampling rates in a single neural vocoder (NV), called the MSR-NV.
no code implementations • 7 Aug 2020 • Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari
We propose a framework for multi-speaker speech synthesis using deep Gaussian processes (DGPs); a DGP is a deep architecture of Bayesian kernel regressions and thus robust to overfitting.