Search Results for author: Yu-Hsiang Wang

Found 3 papers, 2 papers with code

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition

no code implementations • 23 May 2024 • Chan-Jan Hsu, Yi-Chang Chen, Feng-Ting Liao, Pei-Chen Ho, Yu-Hsiang Wang, Po-chun Hsu, Da-Shan Shiu

We introduce "Generative Fusion Decoding" (GFD), a novel shallow fusion framework that integrates Large Language Models (LLMs) into multi-modal text recognition systems such as automatic speech recognition (ASR) and optical character recognition (OCR).

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +4

MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models

1 code implementation • 30 May 2023 • Yu-Hsiang Wang, Huang-Yu Chen, Kai-Wei Chang, Winston Hsu, Hung-Yi Lee

In this paper, we introduce MiniSUPERB, a lightweight benchmark that efficiently evaluates SSL speech models, producing results comparable to SUPERB at significantly lower computational cost.

Self-Supervised Learning

SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking

2 code implementations • 16 Nov 2022 • Yu-Hsiang Wang, Jun-Wei Hsieh, Ping-Yang Chen, Ming-Ching Chang, Hung Hin So, Xin Li

We develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames, further enhancing MOT performance.

Ranked #1 on Multi-Object Tracking on MOT20 (using extra training data)

Multi-Object Tracking • Multiple Object Tracking • +3
