Extracting geographic features from the Internet to automatically build detailed regional gazetteers
Authors: Daniel W. Goldberg (a); John P. Wilson (b); Craig A. Knoblock (c)
Affiliations:
a Department of Computer Science, University of Southern California, Los Angeles, CA 90089-0255, USA
b Department of Geography, University of Southern California, Los Angeles, CA 90089-0255, USA
c Department of Computer Science, University of Southern California, Marina del Rey, CA 90292, USA
DOI: 10.1080/13658810802577262
Publication Frequency: 12 issues per year
Published in: International Journal of Geographical Information Science, Volume 23, Issue 1, January 2009, pages 93-128
Subjects: Cartography; Computer Science (General); Earth Sciences; Geographic Information Systems; Location Based Services; Navigation; Systems & Computer Architecture of Databases; Topography; Transport Geography
Previously published as: International journal of geographical information systems (0269-3798, 1362-3087) until 1996
Abstract
The utility of every imaginable application which incorporates a gazetteer hinges on the simple fact that the resulting system will only be as useful, complete, or accurate as the underlying gazetteer itself. A major issue confronting gazetteers utilized in systems today is that they are not complete and measures of their accuracy are largely unknown. In this paper we describe a methodology which addresses this problem by automatically generating highly complete and detailed regional gazetteers from Internet sources. We utilize information extraction and integration techniques to automatically obtain geographic features and associated footprints and feature types from freely and widely available online data which could be applied to create a gazetteer for nearly any area. We discuss the distinguishing characteristics of the generated gazetteer and extend previous work to define measures which can be used to assess the completeness and accuracy of gazetteers. Using these measures, the generated gazetteer is evaluated against the Alexandria Digital Library Gazetteer and the Los Angeles Comprehensive Bibliographic Database. Our results indicate that a gazetteer created by our methods will be at least as complete as any gazetteer currently available for certain feature classes, while falling short in others. We conclude by offering suggestions to address these shortcomings.
Keywords: Gazetteers; Geographic information extraction
