From: Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese
Embedding / Input features | Only words: All (%) | Only words: OOV (%) | Capitalization: All (%) | Capitalization: OOV (%) | Prefix + suffix: All (%) | Prefix + suffix: OOV (%) | All three: All (%) | All three: OOV (%) |
---|---|---|---|---|---|---|---|---|
Random | 94.27 | 69.18 | 96.06 | 81.55 | 95.49 | 84.02 | 96.93 | 91.62 |
HAL | 95.12 | 74.97 | 96.61 | 84.61 | 95.79 | 85.49 | 97.10 | 91.60 |
NLM | 95.95 | 85.04 | 97.21 | 91.21 | 96.14 | 89.01 | 97.33 | 93.66 |
SG | 95.55 | 81.58 | 96.92 | 88.20 | 96.01 | 88.13 | 97.19 | 93.01 |