Integrating Spatial Data Sets Using Road Networks from Heterogeneous and Autonomous Data Sets
Spatial database integration is defined as the process of identifying the corresponding features from different sources and integrating them into a unified database. The structure of a spatial database depends upon an organization’s needs. To develop an efficient Spatial Data Infrastructure (SDI), several organizations may need to share existing data among themselves instead of duplicating it. Hence, there arises a need for spatial integration of databases to make the data interoperable or to share the data between different geographical information sources (Laurini, 1993), so that data from different sources can be accessed as if from a single, unified source.
The data typically differ in the way they have been captured and stored. They mostly do not have a uniform scale, format, semantics or data model. This heterogeneous state of the data means that the integration of different data sets results in ambiguous features. Therefore, the integration problem is not solved by a simple spatial overlay or merge operation. Deveogele et al. (1998) define spatial database integration as the process of integrating more than one heterogeneous and autonomous spatial data set into a single unified description of reality. Uitermark et al. (1999) define it as the process of identifying the corresponding objects and establishing a relationship between them.
This research addresses the integration of heterogeneous and autonomous spatial databases, with special emphasis on merging unambiguous features into a unified database. The research focuses on integrating linear features from different databases. The main reason for considering linear features is that roads, the linear features of interest here, are man-made and undergo frequent changes, such as the upgrading of a Street to a Road or a Road to a Highway. Deveogele et al. (1998), Deveogele (2002), and Walter and Fritch (1999) have also emphasized procedures for linear feature integration.
The data sets used in this work differ in scale, data model, format and semantics. The research proposes an algorithm called Format, Sharing, Identification and Integration (FSII) to integrate linear features. The FSII algorithm consists of four stages. The first stage brings the data into a compatible format. The second stage shares data logically between the sources and the federation. The third stage identifies potential matching features on the basis of their geometry, and then selects the best matching features on the basis of their semantic correspondence. The fourth stage integrates the corresponding and non-corresponding features into a unified federated data set. The federated database technique is used to share data logically between the sources and the federation, which avoids data redundancy and inconsistency.
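The identification and integration stages can be illustrated with a minimal sketch. It is not the FSII implementation: it assumes both sources have already been brought into a common format and coordinate system (stages one and two), and every name in it is hypothetical. The choice of a vertex-based Hausdorff distance for geometry matching, name similarity for semantic correspondence, and the thresholds `max_dist` and `min_sim` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass(frozen=True)
class Road:
    """Hypothetical linear feature: an id, a name, and polyline vertices."""
    fid: str
    name: str
    coords: tuple  # ((x, y), ...) in a shared coordinate system


def directed_hausdorff(a, b):
    # Largest distance from any vertex of a to its nearest vertex of b.
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def hausdorff(a, b):
    # Symmetric, vertex-based approximation of the Hausdorff distance.
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def name_similarity(a, b):
    # Stand-in for semantic correspondence: string similarity in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def integrate(source_a, source_b, max_dist=10.0, min_sim=0.5):
    """Return (matches, unified) for two lists of Road features."""
    matches = {}
    used_b = set()
    for ra in source_a:
        # Identification, step 1: candidates whose geometry is close enough.
        candidates = [
            rb for rb in source_b
            if rb.fid not in used_b
            and hausdorff(ra.coords, rb.coords) <= max_dist
        ]
        # Identification, step 2: best candidate by semantic similarity.
        best = max(
            candidates,
            key=lambda rb: name_similarity(ra.name, rb.name),
            default=None,
        )
        if best is not None and name_similarity(ra.name, best.name) >= min_sim:
            matches[ra.fid] = best.fid
            used_b.add(best.fid)
    # Integration: keep one representative per matched pair, plus every
    # non-corresponding feature from both sources.
    unified = list(source_a) + [rb for rb in source_b if rb.fid not in used_b]
    return matches, unified


source_a = [
    Road("a1", "Main Street", ((0, 0), (100, 0))),
    Road("a2", "Hill Road", ((0, 50), (100, 50))),
]
source_b = [
    Road("b1", "Main St", ((0, 2), (100, 3))),
    Road("b2", "River Highway", ((0, 200), (100, 200))),
]
matches, unified = integrate(source_a, source_b)
print(matches)
print([r.fid for r in unified])
```

In this toy input, only the two "Main" roads are close in both geometry and name, so they are treated as corresponding features, while the remaining roads are carried into the unified set as non-corresponding features. A real integration would compare whole line segments rather than vertices alone and would draw on richer attributes than the road name.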