Exploration and Reduction of the Feature Space by Hierarchical Clustering
IENCO, Dino; MEO, Rosa
2008-01-01
Abstract
In this paper we propose and test the use of hierarchical clustering for feature selection. The clustering method is Ward's, with a distance measure based on Goodman-Kruskal tau. We motivate the choice of this measure and compare it with alternative ones. Our hierarchical clustering is applied to over 40 data-sets from the UCI archive. The proposed approach is interesting from several viewpoints. First, it produces a dendrogram of feature subsets, which serves as a valuable tool to study relevance relationships among features. Second, the dendrogram is used in a feature selection algorithm that selects the best features by a wrapper method. Experiments were run with three different families of classifiers: Naive Bayes, decision trees, and k-nearest neighbours. Our method allows all three classifiers to generally outperform their counterparts trained without feature selection. We compare our feature selection with other state-of-the-art methods, obtaining on average a better classification accuracy, though with a smaller reduction in the number of features. Moreover, unlike other approaches to feature selection, our method does not require any parameter tuning.