Table 14 Most common wrongly tagged tokens and the tags they have in Mac-Morpho v3 test data

From: Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese

| Token | Tags | Times |
|-------|------|-------|
| que | ADV, PDEN, PROSUB, NPROP, ADV-KS, PRO-KS, KS, PROADJ, ADJ | 383 |
| a | ADV, KC, PROSUB, NPROP, PROPESS, KS, PROADJ, IN, ART, PREP | 161 |
| o | ADV, PROSUB, NPROP, PROPESS, PRO-KS, N, KS, ART | 137 |
| como | ADV-KS, ADV, NPROP, KC, KS, IN, PREP | 127 |
| de | ADV, PDEN, NPROP, N, KS, PREP | 103 |
| um | ADV, PROSUB, ART, N, PROADJ, NUM, NPROP | 68 |
| até | ADV, PDEN, KS, PREP | 60 |
| uma | ADV, PROSUB, NPROP, N, NUM, IN, ART | 59 |
| ao | ADV, PDEN, PREP+PRO-KS, NPROP, PREP+PROSUB, PREP+ART, PREP | 59 |
| mais | ADV, KC, PROSUB, NPROP, KS, PROADJ, PREP | 52 |