Curtin University Homepage
  • Library
  • Help
    • Admin

    espace - Curtin’s institutional repository

    JavaScript is disabled for your browser. Some features of this site may not work without it.
    View Item 
    • espace Home
    • espace
    • Curtin Research Publications
    • View Item
    • espace Home
    • espace
    • Curtin Research Publications
    • View Item

    Evaluation and optimization of frequent, closed and maximal association rule based classification.

    Access Status
    Fulltext not available
    Authors
    Shaharanee, I.
    Hadzic, Fedja
    Date
    2014
    Type
    Journal Article
    
    Metadata
    Show full item record
    Citation
    Shaharanee, I. and Hadzic, F. 2014. Evaluation and optimization of frequent, closed and maximal association rule based classification.. Statistics and Computing. 24 (5): pp. 821-843.
    Source Title
    Statistics and Computing
    DOI
    10.1007/s11222-013-9404-6
    ISSN
    09603174
    School
    Department of Computing
    URI
    http://hdl.handle.net/20.500.11937/34009
    Collection
    • Curtin Research Publications
    Abstract

    Real world applications of association rule mining have well-known problems of discovering a large number of rules, many of which are not interesting or useful for the application at hand. The algorithms for closed and maximal itemsets mining significantly reduce the volume of rules discovered and complexity associated with the task, but the implications of their use and important differences with respect to the generalization power, precision and recall when used in the classification problem have not been examined. In this paper, we present a systematic evaluation of the association rules discovered from frequent, closed and maximal itemset mining algorithms, combining common data mining and statistical interestingness measures, and outline an appropriate sequence of usage. The experiments are performed using a number of real-world datasets that represent diverse characteristics of data/items, and detailed evaluation of rule sets is provided as a whole and w.r.t individual classes. Empirical results confirm that with a proper combination of data mining and statistical analysis, a large number of non-significant, redundant and contradictive rules can be eliminated while preserving relatively high precision and recall.More importantly, the results reveal the important characteristics and differences between using frequent, closed and maximal itemsets for the classification task, and the effect of incorporating statistical/heuristic measures for optimizing such rule sets. With closed itemset mining already being a preferred choice for complexity and redundancy reduction during rule generation, this study has further confirmed that overall closed itemset based association rules are also of better quality in terms of classification precision and recall, and precision and recall on individual class examples. On the other hand maximal itemset based association rules, that are a subset of closed itemset based rules, show to be insufficient in this regard, and typically will have worse recall and generalization power. Empirical results also show the downfall of using the confidence measure at the start to generate association rules, as typically done within the association rule framework. Removing rules that occur below a certain confidence threshold, will also remove the knowledge of existence of any contradictions in the data to the relatively higher confidence rules, and thus precision can be increased by disregarding contradictive rules prior to application of confidence constraint.

    Related items

    Showing items related by title, author, creator and subject.

    • Quality and interestingness of association rules derived from data mining of relational and semi-structured data
      Mohd Shaharanee, Izwan Nizal (2012)
      Deriving useful and interesting rules from a data mining system are essential and important tasks. Problems such as the discovery of random and coincidental patterns or patterns with no significant values, and the generation ...
    • X3-Miner: Mining patterns from XML database
      Chang, Elizabeth; Tan, H.; Dillon, Tharam S.; Feng, L.; Hadzic, Fedja (2005)
      An XML enabled framework for representation of association rules in databases was first presented in [Feng03]. In Frequent Structure Mining (FSM), there are techniques proposed to mine frequent patterns from complex trees ...
    • Techniques for improving clustering and association rules mining from very large transactional databases
      Li, Yanrong (2009)
      Clustering and association rules mining are two core data mining tasks that have been actively studied by data mining community for nearly two decades. Though many clustering and association rules mining algorithms have ...
    Advanced search

    Browse

    Communities & CollectionsIssue DateAuthorTitleSubjectDocument TypeThis CollectionIssue DateAuthorTitleSubjectDocument Type

    My Account

    Admin

    Statistics

    Most Popular ItemsStatistics by CountryMost Popular Authors

    Follow Curtin

    • 
    • 
    • 
    • 
    • 

    CRICOS Provider Code: 00301JABN: 99 143 842 569TEQSA: PRV12158

    Copyright | Disclaimer | Privacy statement | Accessibility

    Curtin would like to pay respect to the Aboriginal and Torres Strait Islander members of our community by acknowledging the traditional owners of the land on which the Perth campus is located, the Whadjuk people of the Nyungar Nation; and on our Kalgoorlie campus, the Wongutha people of the North-Eastern Goldfields.