From: Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese
Embedding / Input features | Only words: All (%) | Only words: OOV (%) | Capitalization: All (%) | Capitalization: OOV (%) | Prefix + suffix: All (%) | Prefix + suffix: OOV (%) | All three: All (%) | All three: OOV (%) |
---|---|---|---|---|---|---|---|---|
Random | 95.10 | 70.75 | 96.65 | 84.58 | 96.12 | 83.64 | 97.33 | 92.37 |
HAL | 95.84 | 78.14 | 97.04 | 86.87 | 96.36 | 86.35 | 97.41 | 92.34 |
NLM | 96.34 | 85.68 | 97.44 | 91.71 | 96.56 | 88.62 | 97.57 | 93.38 |
SG | 96.10 | 82.83 | 97.24 | 89.99 | 96.47 | 87.70 | 97.44 | 93.32 |