Replacing Support in Association Rule Mining

Meo, Rosa; Ienco, Dino
2009-01-01

Abstract

Association rules are an intuitive descriptive paradigm that has been used extensively in recent years, across different application domains, to identify regularities and correlations in a set of observed objects. Recently, however, the statistical measures of association rules (support and confidence) have been criticized because in some cases they fail at their primary goal, which is to select the most relevant and significant association rules. In this paper we propose a new model that replaces the support measure. Like support, the new model is a tool for identifying reliable rules and is also used to reduce the traversal of the itemset search space. The proposed model adopts new criteria to establish the reliability of the information extracted from the database. These criteria are based on Bayes' Theorem and on an estimate of the probability density function of each itemset. According to our criteria, the information obtained from the database on an itemset is reliable if and only if the confidence interval of the estimated probability is small compared with its most likely value. We show how this method can be computed in an approximate but satisfactory way, with computational time comparable to the test on a support threshold.
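The reliability criterion described in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes a uniform prior over the itemset probability, so that observing the itemset in k of n transactions gives a Beta(k+1, n-k+1) posterior, and it approximates the interval width with a normal approximation. The names `posterior_summary`, `is_reliable`, and the threshold `max_ratio` are hypothetical and chosen only for illustration.

```python
import math


def posterior_summary(k, n, z=1.96):
    """Return (mode, interval_width) for an itemset seen in k of n transactions.

    Assumption (not from the paper): a uniform prior, so the posterior of the
    itemset probability is Beta(k + 1, n - k + 1). The width is a normal
    approximation, 2 * z * posterior standard deviation.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    a, b = k + 1, n - k + 1
    mode = k / n  # most likely value of the probability
    variance = a * b / ((a + b) ** 2 * (a + b + 1))  # variance of Beta(a, b)
    return mode, 2 * z * math.sqrt(variance)


def is_reliable(k, n, max_ratio=0.5, z=1.96):
    """Hypothetical test: the interval must be small relative to the mode."""
    mode, width = posterior_summary(k, n, z)
    return mode > 0 and width / mode <= max_ratio


if __name__ == "__main__":
    for k, n in [(500, 10000), (2, 10000), (5, 10)]:
        print(k, n, is_reliable(k, n))
```

Under these assumptions, an itemset seen in 5 of 10 transactions has a high relative frequency but is still judged unreliable, because the interval is wide relative to the most likely value. A fixed support threshold cannot make that distinction. The cost per itemset is a constant number of arithmetic operations, in the same spirit as the abstract's remark about computational time.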
Year: 2009
Book title: Rare Association Rule Mining and Knowledge Discovery: Technologies for Infrequent and Critical Event Detection
Publisher: Information Science Reference
Series: Advances in Data Warehousing and Mining Book Series
Pages: 33-46
ISBN: 1605667544; 9781605667546
http://www.igi-global.com/reference/details.asp?ID=34650
itemsets; minimum support; rare patterns; Bayes' Theorem

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/49142