Table 1 Corpora characteristics

From: Evaluation of cutoff policies for term extraction

Corpus   Texts   Sentences       Words     Terms
Ped        281      27,724     835,412   180,120
SM          88      44,222   1,173,401   252,168
DM          53      42,932   1,127,816   244,439
PP          62      40,928   1,086,771   241,145
Geo        234      69,461   2,010,527   436,401