no code implementations • WMT (EMNLP) 2021 • Kosuke Takahashi, Yoichi Ishibashi, Katsuhito Sudoh, Satoshi Nakamura
This paper describes our submission to the WMT2021 shared metrics task.
no code implementations • IWSLT (EMNLP) 2018 • Kaho Osamura, Takatomo Kano, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
In this paper, a neural sequence-to-sequence ASR is used as feature processing that is trained to produce word posterior features given spoken utterances.
Automatic Speech Recognition (ASR) +4
no code implementations • IWSLT (ACL) 2022 • Ryo Fukuda, Yuka Ko, Yasumasa Kano, Kosuke Doi, Hirotaka Tokuyama, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
This paper describes NAIST’s simultaneous speech translation systems developed for IWSLT 2022 Evaluation Campaign.
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • IWSLT (ACL) 2022 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
Simultaneous translation is a task that requires starting translation before the speaker has finished speaking, so we face a trade-off between latency and accuracy.
no code implementations • IWSLT (EMNLP) 2018 • Johanes Effendi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
In this paper, we investigate and utilize neural paraphrasing to improve translation quality in neural MT (NMT), which has not yet been much explored.
no code implementations • EACL (HumEval) 2021 • Katsuhito Sudoh, Kosuke Takahashi, Satoshi Nakamura
Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of neural machine translation.
no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner
The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.
no code implementations • ICON 2021 • Hour Kaing, Chenchen Ding, Katsuhito Sudoh, Masao Utiyama, Eiichiro Sumita, Satoshi Nakamura
Pretrained multilingual language models have become a key part of cross-lingual transfer for many natural language processing tasks, even those without bilingual information.
no code implementations • ICON 2021 • Kohichi Takai, Gen Hattori, Akio Yoneyama, Keiji Yasuda, Katsuhito Sudoh, Satoshi Nakamura
The proposed method applies the Named Entity (NE) feature vector to Factored Transformer for accurate proper noun translation.
no code implementations • dialdoc (ACL) 2022 • Yuya Nakano, Seiya Kawano, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
Ambiguous questions are generated by eliminating a part of a sentence considering the sentence structure.
no code implementations • ACL (IWSLT) 2021 • Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura
This paper describes the construction of a new large-scale English-Japanese Simultaneous Interpretation (SI) corpus and presents the results of its analysis.
no code implementations • ACL (IWSLT) 2021 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Recent studies argue that knowledge distillation is promising for speech translation (ST) using end-to-end models.
Automatic Speech Recognition (ASR) +4
no code implementations • ACL (IWSLT) 2021 • Ryo Fukuda, Yui Oka, Yasumasa Kano, Yuki Yano, Yuka Ko, Hirotaka Tokuyama, Kosuke Doi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
This paper describes NAIST’s system for the English-to-Japanese Simultaneous Text-to-text Translation Task in IWSLT 2021 Evaluation Campaign.
no code implementations • 7 Feb 2024 • Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura
Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning.
no code implementations • 29 Jan 2024 • Kenta Izumi, Hiroki Tanaka, Kazuhiro Shidara, Hiroyoshi Adachi, Daisuke Kanayama, Takashi Kudo, Satoshi Nakamura
By comparing systems that use LLM-generated responses with those that do not, we investigate the impact of generated responses on subjective evaluations such as mood change, cognitive change, and dialogue quality (e.g., empathy).
no code implementations • 24 Nov 2023 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
In this work, we propose a novel latency evaluation metric for simultaneous translation called Average Token Delay (ATD) that focuses on the duration of partial translations.
no code implementations • 14 Oct 2023 • Takeshi Saga, Hiroki Tanaka, Satoshi Nakamura
We confirmed that an FTD-related subscale, odd speech, was significantly correlated with both the total SPQ and SRS scores, although they themselves were not correlated significantly.
no code implementations • 14 Jun 2023 • Yuka Ko, Ryo Fukuda, Yuta Nishikawa, Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
In this paper, we propose an effective way to train a SimulST model using mixed data of SI and offline translation.
no code implementations • 26 May 2023 • Yuta Nishikawa, Satoshi Nakamura
In this study, we propose an inter-connection mechanism that aggregates the information from each layer of the speech pre-trained model by weighted sums and feeds the result into the decoder.
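A layer-weighted aggregation of this kind can be sketched as a softmax-normalized scalar mix over encoder layers; the NumPy version below is a minimal illustration, and the layer count, shapes, and trainability of `weights` are assumptions, not the paper's exact design.

```python
import numpy as np

def scalar_mix(layer_states, weights):
    """Softmax-normalized weighted sum over the hidden states of every
    encoder layer (a minimal sketch of the inter-connection idea)."""
    w = np.exp(weights - np.max(weights))
    w = w / w.sum()                            # softmax over layers
    stacked = np.stack(layer_states, axis=0)   # (layers, time, dim)
    return np.tensordot(w, stacked, axes=1)    # (time, dim)

# With zero (i.e., uniform) weights this reduces to a plain layer average.
layers = [np.full((4, 8), float(i)) for i in range(3)]   # toy states 0, 1, 2
fused = scalar_mix(layers, np.zeros(3))
print(fused[0, 0])  # 1.0
```

In practice the weights would be learned jointly with the downstream speech translation decoder.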
no code implementations • 19 May 2023 • Hiroki Ouchi, Hiroyuki Shindo, Shoko Wakamiya, Yuki Matsuda, Naoya Inoue, Shohei Higashiyama, Satoshi Nakamura, Taro Watanabe
We have constructed Arukikata Travelogue Dataset and released it free of charge for academic research.
1 code implementation • 25 Apr 2023 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
In this study, we extended SHAS to improve ST translation accuracy and efficiency by splitting speech into shorter segments that correspond to sentences.
no code implementations • 23 Apr 2023 • Jinming Zhao, Yuka Ko, Kosuke Doi, Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Research has been limited due to the lack of a large-scale training corpus.
1 code implementation • 7 Mar 2023 • Kazuma Kobayashi, Lin Gu, Ryuichiro Hataya, Takaaki Mizuno, Mototaka Miyake, Hirokazu Watanabe, Masamichi Takahashi, Yasuyuki Takamizawa, Yukihiro Yoshida, Satoshi Nakamura, Nobuji Kouno, Amina Bolatkan, Yusuke Kurose, Tatsuya Harada, Ryuji Hamamoto
As a result, our SBMIR system enabled users to overcome previous challenges, including image retrieval based on fine-grained image characteristics, image retrieval without example images, and image retrieval for isolated samples.
no code implementations • 1 Mar 2023 • Yuka Okuda, Katsuhito Sudoh, Seitaro Shinagawa, Satoshi Nakamura
A conversational recommender system (CRS) is a practical application for item recommendation through natural language conversation.
1 code implementation • 15 Feb 2023 • Seyed Mahed Mousavi, Shohei Tanaka, Gabriel Roccabruna, Koichiro Yoshino, Satoshi Nakamura, Giuseppe Riccardi
We publish the annotated dataset, annotation materials, and machine learning baseline models for the task of new event extraction for narrative understanding.
1 code implementation • 11 Feb 2023 • Yoichi Ishibashi, Danushka Bollegala, Katsuhito Sudoh, Satoshi Nakamura
To address this question, we conduct a systematic study of the robustness of discrete prompts by applying carefully designed perturbations to prompts produced by AutoPrompt and then measuring their performance on two Natural Language Inference (NLI) datasets.
no code implementations • 8 Jan 2023 • Heli Qi, Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
This paper introduces SpeeChain, an open-source PyTorch-based toolkit designed to develop the machine speech chain for large-scale use.
1 code implementation • IEEE Transactions on Multimedia 2020 • Fan Yang, Yang Wu, Zheng Wang, Xiang Li, Sakriani Sakti, Satoshi Nakamura
Therefore, previous works pre-train their models on rich-labeled photo retrieval data (i.e., source domain) and then fine-tune them on the limited-labeled sketch-to-photo retrieval data (i.e., target domain).
Ranked #1 on Image Retrieval on PKU-Reid
no code implementations • 22 Nov 2022 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
In this work, we propose a novel latency evaluation metric called Average Token Delay (ATD) that focuses on the end timings of partial translations in simultaneous translation.
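As a rough illustration (not the paper's exact definition), ATD can be thought of as the average gap between each output token's end time and the end time of the input token it is paired with; pairing by index, clamped to the last input token, is a simplifying assumption here.

```python
def average_token_delay(in_end_times, out_end_times):
    """Rough sketch of Average Token Delay (ATD): average, over output
    tokens, of the output token's end time minus the end time of its
    paired input token. Index-based pairing clamped to the last input
    token is a simplification; the paper's pairing rule is more refined."""
    delays = [t_out - in_end_times[min(i, len(in_end_times) - 1)]
              for i, t_out in enumerate(out_end_times)]
    return sum(delays) / len(delays)

# Each output token finishes one time unit after its paired input token.
print(average_token_delay([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))  # 1.0
```

Unlike metrics that only look at when translation starts, this formulation penalizes translations whose partial outputs finish long after the corresponding input.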
1 code implementation • 1 Nov 2022 • Keisuke Toyama, Katsuhito Sudoh, Satoshi Nakamura
Although the well-known MR-to-text E2E dataset has been used by many researchers, its MR-text pairs include many deletion/insertion/substitution errors.
1 code implementation • 24 Oct 2022 • Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura
In the field of natural language processing (NLP), continuous vector representations are crucial for capturing the semantic meanings of individual words.
1 code implementation • 27 Aug 2022 • Fan Yang, Norimichi Ukita, Sakriani Sakti, Satoshi Nakamura
By using MOT, the spatiotemporal boundary of each actor is obtained and assigned to a unique actor identity.
4 code implementations • 12 Aug 2022 • Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, RenJie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang
We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning.
no code implementations • 1 Jun 2022 • Holy Lovenia, Hiroki Tanaka, Sakriani Sakti, Ayu Purwarianti, Satoshi Nakamura
Research about brain activities involving spoken word production is considerably underdeveloped because of the undiscovered characteristics of speech artifacts, which contaminate electroencephalogram (EEG) signals and prevent the inspection of the underlying cognitive processes.
no code implementations • 14 May 2022 • Heli Qi, Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura
The existing paradigm of semi-supervised S2S ASR utilizes SpecAugment as data augmentation and requires a static teacher model to produce pseudo transcripts for untranscribed speech.
Automatic Speech Recognition (ASR) +2
no code implementations • 29 Mar 2022 • Naoaki Suzuki, Satoshi Nakamura
As a type of paralinguistic information, English speech uses sentence stress, the heaviest prominence within a sentence, to convey emphasis.
no code implementations • 29 Mar 2022 • Kei Furukawa, Takeshi Kishiyama, Satoshi Nakamura
End-to-end text-to-speech synthesis (TTS), which generates speech sounds directly from strings of text or phonemes, has improved the quality of speech synthesis over conventional TTS.
1 code implementation • 29 Mar 2022 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
We also propose a hybrid method that combines VAD and the above speech segmentation method.
no code implementations • WMT (EMNLP) 2021 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process.
no code implementations • 29 Jul 2021 • Yui Oka, Katsuhito Sudoh, Satoshi Nakamura
Non-autoregressive neural machine translation (NAT) usually employs sequence-level knowledge distillation using autoregressive neural machine translation (AT) as its teacher model.
1 code implementation • SIGDIAL (ACL) 2021 • Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
In order to train the classification model on such training data, we applied the positive/unlabeled (PU) learning method, which assumes that only a part of the data is labeled with positive examples.
no code implementations • COLING 2020 • Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura
Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the positional encoding (PE) during training.
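The noising step amounts to perturbing the exact target length before it enters the positional encoding; a minimal sketch, where the window size is an illustrative choice rather than the paper's value:

```python
import random

def noisy_length_constraint(target_len, window=3):
    """Add uniform integer noise within +/-window to the exact target
    length during training so the model does not overfit to exact
    length constraints (window size is an illustrative choice)."""
    return target_len + random.randint(-window, window)

random.seed(0)
lengths = [noisy_length_constraint(20) for _ in range(5)]
print(lengths)  # five values, each within [17, 23]
```

At inference time the exact desired length would be supplied without noise.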
no code implementations • COLING 2020 • Koichiro Yoshino, Kana Ikeuchi, Katsuhito Sudoh, Satoshi Nakamura
Spoken language understanding (SLU), which converts user requests in natural language to machine-interpretable expressions, is becoming an essential task.
no code implementations • 10 Nov 2020 • Katsuhito Sudoh, Takatomo Kano, Sashi Novitasari, Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura
This paper presents a newly developed, simultaneous neural speech-to-speech translation system and its evaluation.
Automatic Speech Recognition (ASR) +6
no code implementations • 4 Nov 2020 • Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Previous research has proposed a machine speech chain to enable automatic speech recognition (ASR) and text-to-speech synthesis (TTS) to assist each other in semi-supervised learning and to avoid the need for a large amount of paired speech and text data.
Automatic Speech Recognition (ASR) +6
no code implementations • 4 Nov 2020 • Sashi Novitasari, Andros Tjandra, Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura
By contrast, humans can listen to what they speak in real-time, and if there is a delay in hearing, they won't be able to continue speaking.
Automatic Speech Recognition (ASR) +3
no code implementations • 4 Nov 2020 • Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
One main reason is that the model needs to decide the incremental steps and learn the transcription that aligns with the current short speech segment.
Automatic Speech Recognition (ASR) +2
no code implementations • LREC 2020 • Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
We then develop ASR and TTS for ethnic languages by utilizing Indonesian ASR and TTS in a cross-lingual machine speech chain framework with only text or only speech data, removing the need for paired speech-text data of those ethnic languages.
no code implementations • 19 Oct 2020 • Dušan Variš, Katsuhito Sudoh, Satoshi Nakamura
We present our work in progress exploring the possibilities of a shared embedding space between textual and visual modality.
no code implementations • 7 Jul 2020 • Fan Yang, Xin Chang, Chenyu Dang, Ziqiang Zheng, Sakriani Sakti, Satoshi Nakamura, Yang Wu
We aim to improve the performance of Multiple Object Tracking and Segmentation (MOTS) by refinement.
Ranked #1 on Multi-Object Tracking on MOTS20
Multi-Object Tracking • Multi-Object Tracking and Segmentation +2
2 code implementations • ACL 2020 • Yoichi Ishibashi, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
For transferring king into queen in this analogy-based manner, we subtract a difference vector man - woman based on the knowledge that king is male.
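The analogy operation described here is plain vector arithmetic over word embeddings; the toy 3-d vectors below are illustrative stand-ins for real embeddings such as word2vec:

```python
import numpy as np

# Toy 3-d vectors standing in for real word embeddings (illustrative values).
vec = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def nearest(query, exclude):
    """Return the vocabulary word closest to `query` by cosine similarity."""
    return max((w for w in vec if w not in exclude),
               key=lambda w: cos(vec[w], query))

# Analogy-based transfer: king - man + woman should land near queen.
target = vec["king"] - vec["man"] + vec["woman"]
print(nearest(target, exclude={"king", "man", "woman"}))  # queen
```

Excluding the query words themselves is the standard convention in analogy evaluation.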
no code implementations • ACL 2020 • Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura
Our experiments show that our proposed method using Cross-lingual Language Model (XLM) trained with a translation language modeling (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences.
no code implementations • WS 2020 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
This paper describes NAIST's NMT system submitted to the IWSLT 2020 conversational speech translation task.
no code implementations • 24 May 2020 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In this paper, we report our submitted system for the ZeroSpeech 2020 challenge on Track 2019.
no code implementations • LREC 2020 • Sara Asai, Koichiro Yoshino, Seitaro Shinagawa, Sakriani Sakti, Satoshi Nakamura
Expressing emotion is known as an efficient way to persuade one's dialogue partner to accept one's claim or proposal.
no code implementations • 23 Mar 2020 • Koichiro Yoshino, Kohei Wakimoto, Yuta Nishimura, Satoshi Nakamura
Two reasons make it challenging to apply existing sequence-to-sequence models to this mapping: 1) it is hard to prepare a large-scale dataset for any kind of robots and their environment, and 2) there is a gap between the number of samples obtained from robot action observations and generated word sequences of captions.
no code implementations • 27 Nov 2019 • Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura
Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input.
1 code implementation • 24 Nov 2019 • Fan Yang, Feiran Li, Yang Wu, Sakriani Sakti, Satoshi Nakamura
3D panoramic multi-person localization and tracking are prominent in many applications; however, conventional methods using LiDAR equipment could be economically expensive and also computationally inefficient due to the processing of point cloud data.
Ranked #1 on Multi-Object Tracking on MOT15_3D (using extra training data)
1 code implementation • 23 Oct 2019 • Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig
As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses, we introduce intermediate model heads and a loss function.
no code implementations • 2 Oct 2019 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Second, we train a sequence-to-sequence model that directly maps the source language speech to the target language's discrete representation.
no code implementations • WS 2019 • Seiya Kawano, Koichiro Yoshino, Satoshi Nakamura
We introduce an adversarial learning framework for the task of generating conditional responses with a new objective to a discriminator, which explicitly distinguishes sentences by using labels.
3 code implementations • arXiv 2019 • Fan Yang, Sakriani Sakti, Yang Wu, Satoshi Nakamura
Although skeleton-based action recognition has achieved great success in recent years, most of the existing methods may suffer from a large model size and slow execution speed.
Ranked #1 on Hand Gesture Recognition on DHG-14
2 code implementations • WS 2019 • Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
We propose a novel method for selecting coherent and diverse responses for a given dialogue context.
no code implementations • 3 Jun 2019 • Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Previously, a machine speech chain, which is based on sequence-to-sequence deep learning, was proposed to mimic speech perception and production behavior.
Automatic Speech Recognition (ASR) +6
2 code implementations • 28 May 2019 • Andrei C. Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi
To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker that is updated on a token basis (iDST), b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset, and c) adapt it to the incremental turn-taking experimental scenario.
no code implementations • 27 May 2019 • Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura
Our proposed approach significantly improved the intelligibility (in CER), the MOS, and discrimination ABX scores compared to the official ZeroSpeech 2019 baseline or even the topline.
no code implementations • 26 Nov 2018 • Hisao Katsumi, Takuya Hiraoka, Koichiro Yoshino, Kazeto Yamamoto, Shota Motoura, Kunihiko Sadamasa, Satoshi Nakamura
It is required that these systems have sufficient supporting information to argue their claims rationally; however, the systems often do not have enough of such information in realistic situations.
1 code implementation • 20 Nov 2018 • Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
Although generation-based dialogue systems have been widely researched, the responses generated by most existing systems have very low diversity.
no code implementations • 31 Oct 2018 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In our previous work, we applied a speech chain mechanism as a semi-supervised learning method.
Automatic Speech Recognition (ASR) +3
no code implementations • IWSLT (EMNLP) 2018 • Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura
By using information from these multiple sources, these systems achieve large gains in accuracy.
no code implementations • 30 Jul 2018 • Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura
The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation.
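A minimal sketch of such a loss (the paper's exact formulation may differ): the decoder's softmax-expected output embedding is pulled toward the reference word's embedding via cosine distance.

```python
import numpy as np

def embedding_similarity_loss(logits, ref_id, emb):
    """Cosine-distance loss between the decoder's expected output embedding
    (softmax-weighted mixture of all word vectors) and the reference word's
    embedding. A hedged sketch; the paper's exact loss may differ."""
    p = np.exp(logits - logits.max())
    p = p / p.sum()                    # softmax over the vocabulary
    expected = p @ emb                 # (dim,) expected output embedding
    ref = emb[ref_id]
    cosine = expected @ ref / (np.linalg.norm(expected) * np.linalg.norm(ref))
    return 1.0 - cosine                # 0 when prediction aligns with reference

emb = np.eye(3)   # toy 3-word vocabulary with one-hot "embeddings"
confident = embedding_similarity_loss(np.array([10.0, 0.0, 0.0]), 0, emb)
print(round(confident, 3))  # 0.0
```

Because the loss is computed in embedding space, a near-synonym of the reference is penalized less than an unrelated word, which is the stated motivation.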
no code implementations • 22 Jul 2018 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In this paper, we propose two ideas to improve sequence-to-sequence model performance by enhancing the attention module.
no code implementations • WS 2018 • Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura
Positive emotion elicitation seeks to improve user's emotional state through dialogue system interaction, where a chat-based scenario is layered with an implicit goal to address user's emotional needs.
no code implementations • WS 2018 • Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura
This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture of NMT experts and examines a very simple implementation where missing source translations are replaced by a special symbol <NULL>.
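The replacement itself is simple to implement; a sketch with illustrative field names and a dict-based example format (both assumptions):

```python
def fill_missing_sources(example, languages, null_token="<NULL>"):
    """Give a multi-source NMT model one input per encoder by replacing
    any missing source translation with a special symbol (field names
    and the dict-based example format are illustrative)."""
    return {lang: example.get(lang, null_token) for lang in languages}

ex = {"en": "good morning", "fr": "bonjour"}   # German side missing
print(fill_missing_sources(ex, ["en", "fr", "de"]))
# {'en': 'good morning', 'fr': 'bonjour', 'de': '<NULL>'}
```

The special symbol lets the model learn to ignore absent encoders rather than requiring fully parallel multilingual corpora.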
no code implementations • NAACL 2018 • Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Graham Neubig, Satoshi Nakamura
Specifically, for an input sentence, we use a search engine to retrieve sentence pairs whose source sides are similar with the input sentence, and then collect $n$-grams that are both in the retrieved target sentences and aligned with words that match in the source sentences, which we call "translation pieces".
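A simplified sketch of collecting such "translation pieces" (real systems use fuzzy retrieval and learned word alignments; the exact-match heuristic below is an assumption):

```python
def collect_translation_pieces(input_words, retrieved_pairs, max_n=4):
    """Collect target-side n-grams whose aligned source words all appear
    in the input sentence. `retrieved_pairs` holds (source_words,
    target_words, alignment) triples, where alignment is a list of
    (src_idx, tgt_idx) links. A simplified exact-match sketch."""
    pieces = set()
    inp = set(input_words)
    for src, tgt, align in retrieved_pairs:
        matched = {j for i, j in align if src[i] in inp}
        for n in range(1, max_n + 1):
            for start in range(len(tgt) - n + 1):
                if all(j in matched for j in range(start, start + n)):
                    pieces.add(tuple(tgt[start:start + n]))
    return pieces

pairs = [(["le", "chat", "dort"], ["the", "cat", "sleeps"],
          [(0, 0), (1, 1), (2, 2)])]
print(sorted(collect_translation_pieces(["le", "chat", "mange"], pairs)))
# [('cat',), ('the',), ('the', 'cat')]
```

The collected pieces would then be used to reward matching n-grams during NMT decoding.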
no code implementations • 28 Mar 2018 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In the speech chain loop mechanism, ASR also benefits from the ability to further learn an arbitrary speaker's characteristics from the generated speech waveform, resulting in a significant improvement in the recognition rate.
Automatic Speech Recognition (ASR) +4
1 code implementation • 28 Feb 2018 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In machine learning, the Recurrent Neural Network (RNN) has become a popular architecture for sequential data modeling.
no code implementations • 23 Feb 2018 • Seitaro Shinagawa, Koichiro Yoshino, Sakriani Sakti, Yu Suzuki, Satoshi Nakamura
We propose an interactive image-manipulation system with natural language instruction, which can generate a target image from a source image and an instruction that describes the difference between the source and the target image.
no code implementations • 13 Feb 2018 • Takatomo Kano, Sakriani Sakti, Satoshi Nakamura
Sequence-to-sequence attentional-based neural network architectures have been shown to provide a powerful model for machine translation and speech recognition.
no code implementations • IJCNLP 2017 • Louisa Pragst, Koichiro Yoshino, Wolfgang Minker, Satoshi Nakamura, Stefan Ultes
Defining all possible system actions in a dialogue system by hand is tedious work.
Cultural Vocal Bursts Intensity Prediction • Spoken Dialogue Systems
no code implementations • IJCNLP 2017 • Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Graham Neubig, Satoshi Nakamura
Compared to traditional statistical machine translation (SMT), neural machine translation (NMT) often sacrifices adequacy for the sake of fluency.
no code implementations • WS 2017 • Yusuke Oda, Katsuhito Sudoh, Satoshi Nakamura, Masao Utiyama, Eiichiro Sumita
This paper describes the details about the NAIST-NICT machine translation system for WAT2017 English-Japanese Scientific Paper Translation Task.
no code implementations • 30 Oct 2017 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Despite the success of sequence-to-sequence approaches in automatic speech recognition (ASR) systems, the models still suffer from several problems, mainly due to the mismatch between the training and inference conditions.
Automatic Speech Recognition (ASR) +3
no code implementations • 22 Sep 2017 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In this paper, we construct the first end-to-end attention-based encoder-decoder model that maps directly from the raw speech waveform to the text transcription.
Automatic Speech Recognition (ASR) +3
no code implementations • 15 Sep 2017 • Matthias Sperber, Graham Neubig, Jan Niehues, Satoshi Nakamura, Alex Waibel
We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion.
no code implementations • WS 2017 • Koichiro Yoshino, Yu Suzuki, Satoshi Nakamura
We demonstrate an information navigation system for sightseeing domains that has a dialogue interface for discovering user interests for tourist activities.
no code implementations • 16 Jul 2017 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In this paper, we take a step further and develop a closed-loop speech chain model based on deep learning.
Automatic Speech Recognition (ASR) +3
no code implementations • WS 2017 • Makoto Morishita, Yusuke Oda, Graham Neubig, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
Training of neural machine translation (NMT) models usually uses mini-batches for efficiency purposes.
no code implementations • 7 Jun 2017 • Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura
Our proposed RNNs, which are called a Long-Short Term Memory Recurrent Neural Tensor Network (LSTMRNTN) and Gated Recurrent Unit Recurrent Neural Tensor Network (GRURNTN), are made by combining the LSTM and GRU RNN models with the tensor product.
no code implementations • 31 May 2017 • Koichiro Yoshino, Shinsuke Mori, Satoshi Nakamura
This paper investigates and analyzes the effect of dependency information on predicate-argument structure analysis (PASA) and zero anaphora resolution (ZAR) for Japanese, and shows that a straightforward approach to PASA and ZAR works effectively even if dependency information is not available.
no code implementations • 23 May 2017 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Recurrent Neural Networks (RNNs) are a popular choice for modeling temporal and sequential tasks and achieve state-of-the-art performance on various complex problems.
no code implementations • IJCNLP 2017 • Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
In this paper, we propose a novel attention mechanism that has local and monotonic properties.
Automatic Speech Recognition (ASR) +5
no code implementations • ACL 2017 • Yusuke Oda, Philip Arthur, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura
In this paper, we propose a new method for calculating the output layer in neural machine translation systems.
2 code implementations • EMNLP 2016 • Philip Arthur, Graham Neubig, Satoshi Nakamura
Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence.
no code implementations • LREC 2016 • Nurul Lubis, Randy Gomez, Sakriani Sakti, Keisuke Nakamura, Koichiro Yoshino, Satoshi Nakamura, Kazuhiro Nakadai
Emotional aspects play a vital role in making human communication a rich and dynamic experience.
no code implementations • LREC 2016 • Matthias Sperber, Graham Neubig, Satoshi Nakamura, Alex Waibel
Our goal is to improve the human transcription quality via appropriate user interface design.
no code implementations • WS 2015 • Graham Neubig, Makoto Morishita, Satoshi Nakamura
We further perform a detailed analysis of reasons for this increase, finding that the main contributions of the neural models lie in improvement of the grammatical correctness of the output, as opposed to improvements in lexical choice of content words.
no code implementations • TACL 2015 • Philip Arthur, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura
We propose a new method for semantic parsing of ambiguous and ungrammatical input, such as search queries.
no code implementations • LREC 2014 • Hiroaki Shimizu, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura
This makes it possible to compare translation data with simultaneous interpretation data.
no code implementations • LREC 2014 • Sakriani Sakti, Keigo Kubo, Sho Matsumiya, Graham Neubig, Tomoki Toda, Satoshi Nakamura, Fumihiro Adachi, Ryosuke Isotani
This paper outlines the recent development on multilingual medical data and multilingual speech recognition system for network-based speech-to-speech translation in the medical domain.
no code implementations • TACL 2014 • Matthias Sperber, Mirjam Simantzik, Graham Neubig, Satoshi Nakamura, Alex Waibel
In this paper, we study the problem of manually correcting automatic annotations of natural language in as efficient a manner as possible.