1 code implementation • 27 Mar 2024 • Shuai Xiang, Pieter M. Blok, James Burridge, Haozhou Wang, Wei Guo
The diverse and high-quality content generated by recent generative models demonstrates the great potential of using synthetic data to train downstream models.
no code implementations • 17 May 2021 • Etienne David, Mario Serouart, Daniel Smith, Simon Madec, Kaaviya Velumani, Shouyang Liu, Xu Wang, Francisco Pinto Espinosa, Shahameh Shafiee, Izzat S. A. Tahir, Hisashi Tsujimoto, Shuhei Nasuda, Bangyou Zheng, Norbert Kirchgessner, Helge Aasen, Andreas Hund, Pouria Sadeghi-Tehran, Koichi Nagasawa, Goro Ishikawa, Sébastien Dandrifosse, Alexis Carlier, Benoit Mercatoris, Ken Kuroki, Haozhou Wang, Masanori Ishii, Minhajul A. Badhon, Curtis Pozniak, David Shaner LeBauer, Morten Lillemo, Jesse Poland, Scott Chapman, Benoit de Solan, Frédéric Baret, Ian Stavness, Wei Guo
We now release the 2021 version of the Global Wheat Head Detection (GWHD) dataset, which is bigger, more diverse, and less noisy than the 2020 version.
no code implementations • NAACL 2021 • Haozhou Wang, James Henderson, Paola Merlo
Generative adversarial networks (GANs) have succeeded in inducing cross-lingual word embeddings -- maps of matching words across languages -- without supervision.
Bilingual Lexicon Induction Cross-Lingual Word Embeddings +1
no code implementations • IJCNLP 2019 • Haozhou Wang, James Henderson, Paola Merlo
Distributed representations of words which map each word to a continuous vector have proven useful in capturing important linguistic information not only in a single language but also across different languages.
no code implementations • CoNLL 2017 • Christophe Moor, Paola Merlo, James Henderson, Haozhou Wang
This paper describes the University of Geneva's submission to the CoNLL 2017 shared task Multilingual Parsing from Raw Text to Universal Dependencies (listed as the CLCL (Geneva) entry).
no code implementations • WS 2016 • Haozhou Wang, Paola Merlo
Traditional machine translation evaluation metrics such as BLEU and WER have been widely used, but they correlate poorly with human judgements because they capture word similarity badly and impose strict identity matching.
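The strict-identity problem described above can be seen directly in the clipped unigram precision that underlies BLEU. The sketch below is illustrative only (no brevity penalty or higher-order n-grams, and `unigram_precision` is a hypothetical helper, not a function from the paper): a candidate using a synonym of a reference word is penalized, while a candidate with the right words in a scrambled order is not.

```python
from collections import Counter

def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision, the simplest building block of BLEU.

    Matching is exact string identity, so a synonym or inflected
    form of a reference word counts as a complete miss.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    # Each candidate word is credited at most as often as it appears
    # in the reference ("clipping").
    matched = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    return matched / sum(cand_counts.values())

ref = "the cat sat on the mat"
hyp_synonym = "the cat sat on the rug"    # "rug" means "mat", but gets no credit
hyp_scrambled = "the the cat mat on sat"  # same words, nonsensical order

print(unigram_precision(hyp_synonym, ref))    # 5/6 ≈ 0.833
print(unigram_precision(hyp_scrambled, ref))  # 6/6 = 1.0
```

The synonym hypothesis, which a human would rate highly, scores below the scrambled one — exactly the mismatch with human judgement that motivates similarity-aware metrics.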