An ontological gazetteer and its application for place name disambiguation in text

Abstract

The volume of spatial information on the Web grows daily, both in the form of online maps and as references to places embedded in documents and pages. To meet users’ spatial information needs, it is often necessary to recognize, within a document’s text, the places to which it refers. This article presents a next-generation gazetteer: a toponymic dictionary that extends the traditional cataloguing of place names to include geographic elements such as spatial relationships, concepts, and terms related to places. We call it an OntoGazetteer, i.e., a gazetteer that also records semantic connections among places. The ontological gazetteer provides factual and semantic support for solving several common problems in geographic information retrieval. This paper presents the OntoGazetteer and demonstrates its applicability to a place name disambiguation problem. In addition to discussing other problems that the OntoGazetteer can help solve, we present a case study on recognizing and disambiguating place names in news sources.
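To make the idea of a gazetteer that records semantic connections more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical in-memory structure (`Place`, `GazetteerSketch`) in which each place carries named relationships to other places and a set of related terms, and it scores ambiguous candidates by how many of their related places and terms occur in the surrounding text. The class names, the `part_of` relationship label, the scoring rule, and the toy entries are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Place:
    """One gazetteer entry (hypothetical structure, for illustration only)."""
    place_id: str
    name: str
    # relationship label -> ids of related places, e.g. {"part_of": {"illinois"}}
    related: Dict[str, Set[str]] = field(default_factory=dict)
    # terms associated with the place (lowercase, single words in this sketch)
    terms: Set[str] = field(default_factory=set)


class GazetteerSketch:
    """Toy gazetteer with relationships; not the OntoGazetteer itself."""

    def __init__(self) -> None:
        self.places: Dict[str, Place] = {}
        self.by_name: Dict[str, List[str]] = {}

    def add(self, place: Place) -> None:
        self.places[place.place_id] = place
        self.by_name.setdefault(place.name.lower(), []).append(place.place_id)

    def candidates(self, name: str) -> List[str]:
        """Return ids of all places that share the given name."""
        return list(self.by_name.get(name.lower(), []))

    def disambiguate(self, name: str, text: str) -> Optional[str]:
        """Pick the candidate with the most related names or terms in the text.

        Assumed heuristic: one point per related place name or related term
        found among the text's tokens. Ties go to the first candidate added.
        Returns None when the name is not in the gazetteer.
        """
        tokens = set(re.findall(r"\w+", text.lower()))
        best_id: Optional[str] = None
        best_score = -1
        for pid in self.candidates(name):
            place = self.places[pid]
            score = len(place.terms & tokens)
            for related_ids in place.related.values():
                for rid in related_ids:
                    other = self.places.get(rid)
                    if other is not None and other.name.lower() in tokens:
                        score += 1
            if score > best_score:
                best_id, best_score = pid, score
        return best_id


# Toy data, chosen only to show an ambiguous name.
gaz = GazetteerSketch()
gaz.add(Place("illinois", "Illinois"))
gaz.add(Place("missouri", "Missouri"))
gaz.add(Place("springfield_il", "Springfield",
              related={"part_of": {"illinois"}}, terms={"lincoln"}))
gaz.add(Place("springfield_mo", "Springfield",
              related={"part_of": {"missouri"}}))

result = gaz.disambiguate(
    "Springfield", "The governor of Illinois spoke in Springfield today."
)
print(result)
```

In this sketch the relationships do the disambiguation work: the name alone matches two entries, and only the mention of a related place in the same text separates them. Matching is restricted to single-word names and terms to keep the example short.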

Author information

Corresponding author

Correspondence to Ivre Marjorie R. Machado.

Additional information

A previous version of this paper appeared at GEOINFO 2010, the Brazilian Symposium on Geoinformatics.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Machado, I.M.R., de Alencar, R.O., Campos, R.d.O. et al. An ontological gazetteer and its application for place name disambiguation in text. J Braz Comput Soc 17, 267–279 (2011). https://doi.org/10.1007/s13173-011-0044-4

Keywords

  • Geographic information retrieval
  • Gazetteer
  • Spatial ontologies
  • Place name disambiguation