Incremental extraction of association rules in applicative domains

Gallo, A.; Esposito, Roberto; Meo, Rosa; Botta, Marco

doi:10.1080/08839510701252486

In recent years, the KDD process has come to be viewed as iterative and interactive. A user is seldom able to answer all of their questions about the data with a single query. Instead, the typical workflow consists of several steps in which the user iteratively refines the extracted knowledge by inspecting previous results and posing new queries. Given this view of the KDD process, it becomes crucial, in order to reduce the computational effort, to have KDD systems that are able to exploit past results. This is especially true in environments in which the system knowledge base is the result of many discoveries made separately on the data through the collaborative effort of different users. In this paper, we consider the problem of mining frequent association rules from database relations. We first model a general, constraint-based mining language for this task. Then, we propose an algorithm that answers such queries by reusing past results. In particular, this solution is effective for a new class of constraints, called context-dependent, which are more difficult to handle than the traditionally studied item-dependent constraints. Nevertheless, we show that some typical queries in important application domains, such as stock market trading, web log analysis, and gene microarray analysis in bioinformatics, have context-dependent constraints. Experiments in these application domains show that the proposed incremental solution is both effective and viable.
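
The idea of reusing past results can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: it mines frequent itemsets rather than full association rules, and the row layout `(group_id, item, qty)`, the function names `mine` and `refine`, and the sample data are all assumptions made for illustration. The constraint on `qty` is treated here as context-dependent because its value varies from one transaction to another for the same item, so supports have to be recounted on the data. Under the assumption that the new constraint selects a subset of the rows selected by the past query and the support threshold is not lowered, only the itemsets in the past result need to be recounted.

```python
from collections import Counter
from itertools import combinations


def group_items(rows, predicate):
    """Group the items of the rows that satisfy the predicate by group id."""
    groups = {}
    for gid, item, qty in rows:
        if predicate(gid, item, qty):
            groups.setdefault(gid, set()).add(item)
    return groups


def mine(rows, predicate, min_count, max_size=3):
    """Mine frequent itemsets from scratch (naive enumeration, for illustration)."""
    counts = Counter()
    for items in group_items(rows, predicate).values():
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(items), k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def refine(rows, past_result, predicate, min_count):
    """Answer a more restrictive query by recounting only past itemsets.

    Assumes the new predicate selects a subset of the rows used for
    past_result and that min_count is not lower than before, so every
    itemset frequent now was already frequent in the past result.
    """
    groups = group_items(rows, predicate)
    result = {}
    for itemset in past_result:
        count = sum(1 for items in groups.values() if itemset <= items)
        if count >= min_count:
            result[itemset] = count
    return result


# Hypothetical purchase data: (group_id, item, quantity).
rows = [
    (1, "a", 1), (1, "b", 5), (1, "c", 2),
    (2, "a", 3), (2, "b", 4),
    (3, "a", 2), (3, "b", 1), (3, "c", 6),
    (4, "b", 2), (4, "c", 3),
]

# Past query: no restriction on quantity.
past = mine(rows, lambda gid, item, qty: True, min_count=2)

# New query: a stricter constraint on quantity, answered from the past result.
at_least_two = lambda gid, item, qty: qty >= 2
refined = refine(rows, past, at_least_two, min_count=2)
```

In this sketch `refine` scans the data once for a fixed set of candidate itemsets instead of enumerating all combinations again, which is where the saving would come from when the past result is much smaller than the full search space.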