A Novel Distance-Based Classifier Built on Pattern Ranking

MEO, Rosa
2009-01-01

Abstract

Instance-based classifiers that compute similarity between instances suffer from noise in the training set and from overfitting. In this paper we propose a new type of distance-based classifier that, instead of computing distances between instances, computes the distance between each test instance and the classes. Both the test instance and the classes are represented by patterns in the space of the frequent itemsets. We rank the itemsets by metrics of itemset significance and retain only the top portion of the ranking that leads the classifier to its maximum accuracy. We have experimented on a large collection of datasets from the UCI archive with different proximity measures and different itemset-ranking metrics. We show that our method has several benefits: it reduces the number of distance computations; it improves on the classification accuracy of state-of-the-art classifiers such as decision trees, SVM, k-NN, Naive Bayes, rule-based classifiers and association rule-based ones; and it outperforms the competitors especially on noisy data.
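
The idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the brute-force itemset miner, the ranking metric (the spread of per-class support), the Euclidean proximity measure, the fixed `top_k` cut-off (the abstract instead selects the portion that maximises accuracy), and all function names and toy data are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def frequent_itemsets(rows, min_support, max_size=2):
    """Brute-force miner (assumption): itemsets whose relative support >= min_support."""
    items = sorted({item for row, _ in rows for item in row})
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(itemset <= row for row, _ in rows) / len(rows)
            if support >= min_support:
                found.append(itemset)
    return found


def train(rows, min_support=0.3, top_k=3):
    """Represent each class by the within-class support of the top-ranked itemsets."""
    itemsets = frequent_itemsets(rows, min_support)
    labels = sorted({label for _, label in rows})
    profiles = {}
    for label in labels:
        members = [row for row, lab in rows if lab == label]
        profiles[label] = [sum(s <= row for row in members) / len(members)
                           for s in itemsets]

    # Hypothetical significance metric: how far apart the classes are on an itemset.
    def spread(index):
        values = [profiles[label][index] for label in labels]
        return max(values) - min(values)

    kept = sorted(range(len(itemsets)), key=spread, reverse=True)[:top_k]
    kept_itemsets = [itemsets[i] for i in kept]
    kept_profiles = {label: [profiles[label][i] for i in kept] for label in labels}
    return kept_itemsets, kept_profiles


def predict(model, instance):
    """Assign the class whose profile is nearest to the instance (one distance per class)."""
    itemsets, profiles = model
    vector = [1.0 if s <= instance else 0.0 for s in itemsets]

    def distance(label):
        return sqrt(sum((a - b) ** 2 for a, b in zip(vector, profiles[label])))

    return min(sorted(profiles), key=distance)


# Toy training set (invented): each row is (set of items, class label).
rows = [
    ({"a", "b"}, "pos"),
    ({"a", "b", "c"}, "pos"),
    ({"a", "c"}, "pos"),
    ({"c", "d"}, "neg"),
    ({"d"}, "neg"),
    ({"b", "d"}, "neg"),
]
model = train(rows)
print(predict(model, {"a", "c"}))
print(predict(model, {"c", "d"}))
```

In this sketch each test instance is compared against one profile per class rather than against every training instance, which is where the reduction in distance computations mentioned in the abstract would come from.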
2009
24th ACM Symposium on Applied Computing
Honolulu, Hawaii, USA
March, 2009
Proceedings of 24th ACM Symposium on Applied Computing
ACM
3
1427
1432
9781605581668
http://www.acm.org/conferences/sac/sac2009/
instance-based learning; frequent itemsets; pattern ranking; associative classifiers
D. BACHAR; R. MEO

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/50654