M. Wallace, S. Kollias
Robust, Generalized, Quick and Efficient Agglomerative Clustering
Proceedings of 6th International Conference on Enterprise Information Systems, Porto, Portugal, April 2004
ABSTRACT
Hierarchical approaches, which are dominated by the generic agglomerative clustering algorithm, are suitable for cases in which the number of distinct clusters in the data is not known a priori; this is not a rare case in real data. On the other hand, their application raises important problems, such as high complexity and susceptibility to errors in the initial steps that propagate all the way to the final output. Finally, as with all other clustering techniques, their efficiency decreases as the dimensionality of their input increases. In this paper we propose a robust, generalized, quick and efficient extension to the generic agglomerative clustering process. Robust refers to the proposed approach's ability to overcome the classic algorithm's susceptibility to errors in the initial steps; generalized, to its ability to simultaneously consider multiple distance metrics; quick, to its suitability for larger datasets, achieved by applying the computationally expensive components to only a subset of the available data samples; and efficient, to its ability to produce results that are comparable to those of trained classifiers, largely outperforming the generic agglomerative process.
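For orientation, the generic agglomerative process that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration and not the extension proposed in the paper: the function names, the average-linkage merge criterion, the Euclidean metric and the distance threshold used as a stopping rule are all assumptions chosen for the example. The threshold stands in for the case where the number of clusters is not known in advance.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length points (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points, dist):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, threshold, dist=euclidean):
    """Generic agglomerative clustering (illustrative sketch).

    Starts with one singleton cluster per point and repeatedly merges the
    closest pair of clusters. Merging stops once the closest pair is farther
    apart than `threshold`, so the cluster count is an output, not an input.
    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        # Exhaustive search for the closest pair: this is the expensive step.
        (a, b), d = min(
            (
                ((i, j), average_linkage(clusters[i], clusters[j], points, dist))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if d > threshold:
            break
        # Merges are never undone, so an early wrong merge persists.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(agglomerate(data, threshold=3.0))
```

The two comments inside the loop mark the weaknesses the abstract names: every pass searches all cluster pairs, which accounts for the high complexity, and a merge made in an early step is carried through to the final output.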
14 April 2004