GTM-UVigo Systems for the Query-by-Example Search on Speech Task at MediaEval 2015
In this paper, we present the systems developed by the GTM-UVigo team for the Query-by-Example Search on Speech Task (QUESST) at MediaEval 2015. Each system is a fusion of 11 dynamic time warping (DTW) based subsystems that use phoneme posteriorgrams for speech representation. The primary system introduces a technique for selecting the most relevant phonetic units of each phoneme decoder, which improves the search results.
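To make the idea of DTW search over phoneme posteriorgrams concrete, here is a minimal sketch of subsequence DTW between a spoken query and a longer document. It is not the GTM-UVigo implementation: the local distance (negative log of the inner product between posterior vectors), the normalization by query length, and the names `toy_posteriorgram`, `local_cost`, and `subsequence_dtw` are all assumptions chosen for illustration. The fusion of several subsystems and the selection of phonetic units are not covered.

```python
import numpy as np


def toy_posteriorgram(labels, n_classes=4, peak=0.97):
    """Build a synthetic posteriorgram (frames x phone classes).

    Hypothetical helper: a real system would obtain posteriors
    from a phoneme decoder instead.
    """
    post = np.full((len(labels), n_classes), (1.0 - peak) / (n_classes - 1))
    post[np.arange(len(labels)), labels] = peak
    return post


def local_cost(query, doc, eps=1e-10):
    """Frame-level cost: -log of the inner product of posterior vectors.

    This distance is an assumption; other choices are possible.
    """
    return -np.log(np.clip(query @ doc.T, eps, None))


def subsequence_dtw(query, doc):
    """Find the best match of `query` anywhere inside `doc`.

    Returns (cost normalized by query length, end frame in doc).
    """
    cost = local_cost(query, doc)
    n, m = cost.shape
    acc = np.full((n, m), np.inf)
    acc[0, :] = cost[0, :]  # the match may start at any document frame
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + cost[i, 0]
        for j in range(1, m):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = cost[i, j] + best_prev
    end = int(np.argmin(acc[n - 1]))  # the match may end at any document frame
    return acc[n - 1, end] / n, end


# Toy example: the query phone sequence [3, 0, 1] occurs at document frames 3 to 5.
doc = toy_posteriorgram([0, 1, 2, 3, 0, 1, 3, 2])
query = toy_posteriorgram([3, 0, 1])
score, end = subsequence_dtw(query, doc)
```

A lower `score` indicates a better match. In a fused system, a score of this kind from each subsystem would typically be normalized and combined before a detection decision is made.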
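Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Keyword Spotting | QUESST | GTM-UVigo Primary late submission (eval) | Cnxe | 0.871 | # 23 |
Keyword Spotting | QUESST | GTM-UVigo Primary late submission (eval) | MinCnxe | 0.838 | # 29 |
Keyword Spotting | QUESST | GTM-UVigo Primary late submission (eval) | lowerbound | 0.592 | # 8 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive late submission (eval) | Cnxe | 0.989 | # 47 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive late submission (eval) | MinCnxe | 0.852 | # 33 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive late submission (eval) | lowerbound | 0.613 | # 6 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive late submission (dev) | Cnxe | 0.907 | # 28 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive late submission (dev) | MinCnxe | 0.864 | # 36 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive late submission (dev) | lowerbound | 0.618 | # 5 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive (eval) | Cnxe | 0.999 | # 51 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive (eval) | MinCnxe | 0.923 | # 47 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive (eval) | lowerbound | 0.633 | # 2 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive (dev) | Cnxe | 0.998 | # 48 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive (dev) | MinCnxe | 0.918 | # 45 |
Keyword Spotting | QUESST | GTM-UVigo Contrastive (dev) | lowerbound | 0.635 | # 1 |
Keyword Spotting | QUESST | GTM-UVigo Primary late submission (dev) | Cnxe | 0.875 | # 25 |
Keyword Spotting | QUESST | GTM-UVigo Primary late submission (dev) | MinCnxe | 0.847 | # 32 |
Keyword Spotting | QUESST | GTM-UVigo Primary late submission (dev) | lowerbound | 0.593 | # 7 |
Keyword Spotting | QUESST | GTM-UVigo Primary (eval) | Cnxe | 0.919 | # 32 |
Keyword Spotting | QUESST | GTM-UVigo Primary (eval) | MinCnxe | 0.905 | # 42 |
Keyword Spotting | QUESST | GTM-UVigo Primary (eval) | lowerbound | 0.629 | # 3 |
Keyword Spotting | QUESST | GTM-UVigo Primary (dev) | Cnxe | 0.917 | # 30 |
Keyword Spotting | QUESST | GTM-UVigo Primary (dev) | MinCnxe | 0.905 | # 42 |
Keyword Spotting | QUESST | GTM-UVigo Primary (dev) | lowerbound | 0.627 | # 4 |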