Curtin University Homepage
  • Library
  • Help
    • Admin

    espace - Curtin’s institutional repository

    JavaScript is disabled for your browser. Some features of this site may not work without it.
    View Item 
    • espace Home
    • espace
    • Curtin Research Publications
    • View Item
    • espace Home
    • espace
    • Curtin Research Publications
    • View Item

    Mining Induced/Embedded Subtrees using the Level of Embedding Constraint

    Access Status
    Fulltext not available
    Authors
    Tan, H.
    Hadzic, Fedja
    Dillon, T.
    Date
    2012
    Type
    Journal Article
    
    Metadata
    Show full item record
    Citation
    Tan, H. and Hadzic, F. and Dillon, T. 2012. Mining Induced/Embedded Subtrees using the Level of Embedding Constraint. Fundamenta Informaticae. 119 (2): pp. 187-231.
    Source Title
    Fundamenta Informaticae
    DOI
    10.3233/FI-2012-733
    ISSN
    01692968
    School
    Department of Computing
    URI
    http://hdl.handle.net/20.500.11937/34877
    Collection
    • Curtin Research Publications
    Abstract

    The increasing need for representing information through more complex structures where semantics and relationships among data objects can be more easily expressed has resulted in many semi-structured data sources. Structure comparison among semi-structured data objects can often reveal valuable information, and hence tree mining has gained a considerable amount of interest in areas such as XML mining, Bioinformatics, Web mining etc. We are primarily concerned with the task of mining frequent ordered induced and embedded subtrees from a database of rooted ordered labelled trees. Our previous contributions consist of the efficient Tree Model Guided (TMG) candidate enumeration approach for which we developed a mathematical model that provides an estimate of the worst case complexity for embedded subtree mining. This potentially reveals computationally impractical situations where one would be forced to constrain the mining process in some way so that at least some patterns can be discovered. This motivated our strategy of tackling the complexity of mining embedded subtrees by introducing the Level of Embedding constraint.Thus, when it is too costly to mine all frequent embedded subtrees, one can decrease the level of embedding constraint gradually down to 1, from which all the obtained frequent subtrees are induced subtrees. In this paper we develop alternative implementations and propose two algorithms MB3-R and iMB3-R, which achieve better efficiency in terms of time and space. Furthermore, we develop a mathematical model for estimating the worst case complexity for induced subtree mining. It is accompanied with a theoretical analysis of induced-embedded subtree relationships in terms of complexity for frequent subtree mining. Using synthetic and real world data we practically demonstrate the space and time efficiency of our new approach and provide some comparisons to the two well know algorithms for mining induced and embedded subtrees.

    Related items

    Showing items related by title, author, creator and subject.

    • Tree model guided candidate generation for mining frequent subtrees from XML
      Tan, Henry; Hadzic, Fedja; Dillon, Tharam S.; Chang, Elizabeth; Feng, Ling; Feng, L. (2008)
      Due to the inherent flexibilities in both structure and semantics, XML association rules mining faces few challenges, such as: a more complicated hierarchical data structure and ordered data context. Mining frequent ...
    • Quality and interestingness of association rules derived from data mining of relational and semi-structured data
      Mohd Shaharanee, Izwan Nizal (2012)
      Deriving useful and interesting rules from a data mining system are essential and important tasks. Problems such as the discovery of random and coincidental patterns or patterns with no significant values, and the generation ...
    • IMB3-Miner: Mining induced/embedded subtrees by constraining the level of embedding
      Tan, H.; Dillon, Tharam S.; Hadzic, Fedja; Chang, Elizabeth; Feng, L. (2006)
      Tree mining has recently attracted a lot of interest in areas such as Bioinformatics, XML mining, Web mining, etc. We are mainly concerned with mining frequent induced and embedded subtrees. While more interesting patterns ...
    Advanced search

    Browse

    Communities & CollectionsIssue DateAuthorTitleSubjectDocument TypeThis CollectionIssue DateAuthorTitleSubjectDocument Type

    My Account

    Admin

    Statistics

    Most Popular ItemsStatistics by CountryMost Popular Authors

    Follow Curtin

    • 
    • 
    • 
    • 
    • 

    CRICOS Provider Code: 00301JABN: 99 143 842 569TEQSA: PRV12158

    Copyright | Disclaimer | Privacy statement | Accessibility

    Curtin would like to pay respect to the Aboriginal and Torres Strait Islander members of our community by acknowledging the traditional owners of the land on which the Perth campus is located, the Whadjuk people of the Nyungar Nation; and on our Kalgoorlie campus, the Wongutha people of the North-Eastern Goldfields.