From: Evaluation of cutoff policies for term extraction
Corpus  Texts  Sentences      Words    Terms
Ped       281     27,724    835,412  180,120
SM         88     44,222  1,173,401  252,168
DM         53     42,932  1,127,816  244,439
PP         62     40,928  1,086,771  241,145
Geo       234     69,461  2,010,527  436,401