Generating and Scoring Correction Candidates in Chinese Grammatical Error Diagnosis

WS 2016 · Shao-Heng Chen, Yu-Lin Tsai, Chuan-Jie Lin

Grammatical error diagnosis is an essential part of a language-learning tutoring system. Based on the data sets of Chinese grammatical error detection tasks, we propose a system that measures the likelihood of correction candidates generated by deleting or inserting characters or words, moving substrings to different positions, substituting prepositions with other prepositions, or substituting words with their synonyms or similar strings. Sentence likelihood is measured from the frequencies of substrings in the space-removed version of the Google n-grams. Evaluation on the training set shows that the Missing-related and Selection-related candidate generation methods perform promisingly. Our final system achieved a precision of 30.28% and a recall of 62.85% at the identification level on the test set.
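
The generate-then-score idea described in the abstract can be illustrated with a minimal sketch. It covers only one candidate type (single-character deletion) and uses a tiny hand-made count table in place of the space-removed Google n-grams. The names `TOY_COUNTS`, `deletion_candidates`, `score`, and `best_candidate` are hypothetical, and the length-normalized log-count score is an assumption made for illustration, not the scoring function used in the paper.

```python
import math

# Hypothetical toy counts standing in for space-removed Google n-gram
# frequencies; the real resource is far larger.
TOY_COUNTS = {
    "我是": 50, "是學": 5, "學生": 80, "是是": 1,
    "我是學": 20, "是學生": 30,
}


def deletion_candidates(sentence):
    """Return every string obtained by deleting exactly one character."""
    return [sentence[:i] + sentence[i + 1:] for i in range(len(sentence))]


def score(sentence, counts, max_n=3):
    """Average log(1 + count) over all substrings of length 2..max_n.

    Averaging (an assumption of this sketch) keeps shorter candidates
    from being penalized simply for having fewer substrings.
    """
    total = 0.0
    num = 0
    for n in range(2, max_n + 1):
        for i in range(len(sentence) - n + 1):
            total += math.log1p(counts.get(sentence[i:i + n], 0))
            num += 1
    return total / num if num else 0.0


def best_candidate(sentence, counts):
    """Pick the highest-scoring string among the input and its deletions."""
    candidates = [sentence] + deletion_candidates(sentence)
    return max(candidates, key=lambda s: score(s, counts))


print(best_candidate("我是是學生", TOY_COUNTS))
```

With these toy counts, the sentence with a redundant 是 scores lower than the candidate with one 是 removed, so the deletion candidate is selected. Insertion, movement, and substitution candidates could be added to the candidate list in the same way and ranked by the same score.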
