By Alan Woodley

Using Parallel Hierarchical Clustering to Address Spatial Big Data Challenges

Dec 16
Publication Type
Journal paper
State/Country
Australia
Other Authors
Ling-Xiang Tang, Shlomo Geva, Richi Nayak and Timothy Chappell
CRC Contact
QUT and CRCSI
Description

Abstract: Clustering makes large datasets more manageable by grouping similar objects together. However, most clustering approaches do not scale to very large datasets (e.g., more than 10 million objects). The K-Tree is a data structure and clustering algorithm that has been shown to scale to large streaming datasets. Here, we apply the K-Tree to spatial data (satellite images) and extend it from a single-threaded to a multicore environment. We show that the K-Tree clusters larger datasets more efficiently than baseline approaches.
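
As a rough illustration of the general idea of hierarchical clustering, the sketch below builds a tree of cluster centres by repeatedly splitting a set of points in two. It is a simplified stand-in and not the K-Tree described in the paper. The function names (`two_means`, `build_tree`, `leaves`), the `leaf_size` parameter, and the toy two-dimensional points are all assumptions made for this example; real satellite data would use one feature vector per pixel or image tile.

```python
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def two_means(points, iterations=10):
    """Split points into two groups with a few rounds of 2-means.

    Seeds are chosen deterministically (an assumption for this sketch):
    the first point and the point farthest from it.
    """
    a = points[0]
    b = max(points, key=lambda p: dist(p, a))
    left, right = [], []
    for _ in range(iterations):
        left = [p for p in points if dist(p, a) <= dist(p, b)]
        right = [p for p in points if dist(p, a) > dist(p, b)]
        if not left or not right:
            break
        a, b = mean(left), mean(right)
    return left, right


def build_tree(points, leaf_size=2):
    """Recursively split points until each leaf holds at most leaf_size."""
    node = {"centre": mean(points), "points": None, "children": []}
    if len(points) <= leaf_size:
        node["points"] = points
        return node
    left, right = two_means(points)
    if not left or not right:  # identical points cannot be split further
        node["points"] = points
        return node
    node["children"] = [build_tree(left, leaf_size), build_tree(right, leaf_size)]
    return node


def leaves(node):
    """Return the point groups stored at the leaves, left to right."""
    if not node["children"]:
        return [node["points"]]
    return [group for child in node["children"] for group in leaves(child)]


# Toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
tree = build_tree(points)
print(leaves(tree))
```

In this sketch the two subtrees produced by each split share no data, so they could in principle be built by separate workers. Whether that matches how the paper moves from a single-threaded to a multicore setting is not stated in the abstract.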