Sue Hope

Integration of Vector Datasets

University of Melbourne
Supervisor (Academic)
Dr Allison Kealy, University of Melbourne
Supervisor (Industry)
Geoff Menner, Logica CMG
Thesis Abstract

As the spatial information industry moves from an era of data collection to one of data maintenance, new integration methods to consolidate or to update datasets are required. These must reduce the discrepancies that are becoming increasingly apparent when spatial datasets are overlaid. It is essential that any such methods consider the quality characteristics of, firstly, the data being integrated and, secondly, the resultant data. This thesis develops techniques that give due consideration to data quality during the integration process.

Methods to integrate vector datasets have been developed within the two spatial science domains of GIS and surveying. Techniques developed within the GIS realm tend to follow the seminal conflation approach of Saalfeld (1988). Although such methods aim to align corresponding features across datasets, they suffer from a number of limitations, particularly with regard to the consideration given to data quality. In contrast, surveyors have taken a least squares-based approach to data integration. Typically applied to the positional accuracy improvement of legacy cadastral databases, this approach determines a rigorous positioning solution. It takes into account the positions, and associated accuracies, of both datasets and enables geometric constraints, such as collinearity, to be formulated as additional observations. Updated measures of the quality of the resultant data are also provided. However, least squares-based approaches are restricted in their current application to datasets containing features with well-defined vertices. They are also limited in terms of the types of spatial integrity constraints that they can preserve.
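The way a least squares adjustment weighs positions by their accuracies, and returns an updated quality measure, can be illustrated with a deliberately small sketch. It assumes the simplest possible case: one coordinate of the same point observed directly in two datasets, with uncorrelated errors. The function name `fuse_coordinate` and all numeric values are hypothetical and are not taken from the thesis.

```python
def fuse_coordinate(obs):
    """Weighted least squares estimate from direct observations of one coordinate.

    obs: list of (value, standard_deviation) pairs, assumed uncorrelated.
    Returns (estimate, standard_deviation_of_estimate).
    """
    # Each observation is weighted by the inverse of its variance.
    weights = [1.0 / (sd * sd) for _, sd in obs]
    total = sum(weights)
    estimate = sum(w * v for w, (v, _) in zip(weights, obs)) / total
    # The variance of the adjusted value is the inverse of the summed weights.
    return estimate, (1.0 / total) ** 0.5


# Hypothetical easting of the same corner in two datasets (metres).
legacy = (1000.0, 2.0)  # legacy cadastral database, 2.0 m standard deviation
survey = (1001.0, 0.5)  # more recent survey, 0.5 m standard deviation
easting, sd = fuse_coordinate([legacy, survey])
```

In this sketch the adjusted easting lies much closer to the more accurate survey value, and its standard deviation is smaller than that of either input. A full adjustment of the kind described above would solve for many coordinates at once and add geometric constraints as further observations.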

This research addresses these limitations by developing techniques to enhance the least squares-based approach to data integration. Firstly, a case study is established to assess how well the quality measures output by the integration process model the positional accuracy of the resultant data. Secondly, a technique based on a novel point-matching method is developed to derive observations across duplicate features that do not exhibit one-to-one vertex correspondence. This underlies a feature-based data integration process that extends the application of least squares methods to datasets containing natural features.

Lastly, functional models are derived for a range of spatial integrity constraints, such as the disjoint topological relationship, that are not currently included in integration methods. As these are modelled as inequalities, an algorithm is developed that enhances the standard least squares method to enable their inclusion within the data integration process. As a result, the relationships are preserved whilst the information that they contain is used to augment the system.
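To make the idea of an inequality within a least squares adjustment concrete, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes a single 2D point with a known covariance matrix and one linear inequality of the hypothetical form a . x >= b, standing in for a constraint such as "this point stays on its side of a boundary". If the observed position already satisfies the inequality it is left unchanged; otherwise the inequality is treated as an equality and the point is moved the minimum weighted distance onto it.

```python
def adjust_with_inequality(x0, cov, a, b):
    """Minimise (x - x0)^T cov^-1 (x - x0) subject to a . x >= b.

    x0: observed 2D position; cov: its 2x2 covariance matrix;
    a, b: coefficients of one linear inequality (hypothetical constraint form).
    Returns (adjusted position, whether the constraint was active).
    """
    slack = a[0] * x0[0] + a[1] * x0[1] - b
    if slack >= 0.0:
        # Constraint already satisfied: the observation is the solution.
        return (x0[0], x0[1]), False
    # Constraint violated: enforce it as an equality. The correction is
    # directed along cov @ a, so correlated coordinates move together.
    ca = (cov[0][0] * a[0] + cov[0][1] * a[1],
          cov[1][0] * a[0] + cov[1][1] * a[1])
    scale = -slack / (a[0] * ca[0] + a[1] * ca[1])
    return (x0[0] + scale * ca[0], x0[1] + scale * ca[1]), True


# Hypothetical point observed 0.3 m on the wrong side of a boundary along y = 0.
observed = (0.0, -0.3)
covariance = [[0.04, 0.03], [0.03, 0.09]]  # correlated x and y errors (m^2)
adjusted, active = adjust_with_inequality(observed, covariance, (0.0, 1.0), 0.0)
```

Because the x and y errors are correlated in this example, pulling the point back onto the boundary also shifts its x coordinate. This is one small sense in which a constraint contributes information to the solution rather than merely being enforced afterwards.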

The enhanced least squares-based data integration process developed within this research is able to use all of the available information in determining the most probable positioning solution when vector datasets are overlaid. This includes the positions, and associated accuracies, of all features and any defined spatial relationships. The positional accuracy of databases can be improved while the spatial integrity of the data is preserved. Furthermore, the process returns measures of the precision of the resultant data at the level of the individual coordinate, offering detailed information regarding the quality of the integrated datasets.