Open Access

scriptLattes: an open-source knowledge extraction system from the Lattes platform

  • Jesús Pascual Mena-Chalco1 and
  • Roberto Marcondes Cesar Junior1
Journal of the Brazilian Computer Society15:BF03194511

DOI: 10.1007/BF03194511

Received: 7 May 2009

Accepted: 15 December 2009

Abstract

The Lattes platform is the major scientific information system maintained by the National Council for Scientific and Technological Development (CNPq). This platform allows to manage the curricular information of researchers and institutions working in Brazil based on the so called Lattes Curriculum. However, the public information is individually available for each researcher, not providing the automatic creation of reports of several scientific productions for research groups. It is thus difficult to extract and to summarize useful knowledge for medium to large size groups of researchers. This paper describes the design, implementation and experiences with scriptLattes: an open-source system to create academic reports of groups based on curricula of the Lattes Database. The scriptLattes system is composed by the following modules: (a) data selection, (b) data preprocessing, (c) redundancy treatment, (d) collaboration graph generation among group members, (e) research map generation based on geographical information, and (f) automatic report creation of bibliographical, technical and artistic production, and academic supervisions. The system has been extensively tested for a large variety of research groups of Brazilian institutions, and the generated reports have shown an alternative to easily extract knowledge from data in the context of Lattes platform. The source code, usage instructions and examples are available at http://scriptlattes.sourceforge.net/.