From: Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese
Embeddings / Input features | Only words: All (%) | Only words: OOV (%) | Capitalization: All (%) | Capitalization: OOV (%) | Prefix + suffix: All (%) | Prefix + suffix: OOV (%) | All three: All (%) | All three: OOV (%) |
---|---|---|---|---|---|---|---|---|
Random | 94.99 | 74.46 | 96.50 | 83.72 | 96.04 | 86.37 | 97.25 | 93.40 |
HAL | 95.61 | 77.81 | 96.94 | 85.95 | 96.25 | 87.35 | 97.29 | 92.91 |
NLM | 96.31 | 85.64 | 97.32 | 91.16 | 96.46 | 90.04 | 97.48 | 94.34 |
SG | 95.89 | 82.56 | 97.14 | 88.64 | 96.41 | 89.17 | 97.35 | 93.61 |