You were saying? - Spoken Language in the V3C Dataset

15 Dec 2022 · Luca Rossetto

This paper presents an analysis of the distribution of spoken language in the V3C video retrieval benchmark dataset based on automatically generated transcripts. It finds that a large portion of the dataset is covered by spoken language. Since language transcripts can be quickly and accurately described, this has implications for retrieval tasks such as known-item search.
