
dc.contributor.author: Tan, Henry
dc.contributor.author: Hadzic, Fedja
dc.contributor.author: Dillon, Tharam S.
dc.contributor.author: Chang, Elizabeth
dc.contributor.author: Feng, Ling
dc.contributor.author: Feng, L.
dc.date.accessioned: 2017-01-30T11:45:31Z
dc.date.available: 2017-01-30T11:45:31Z
dc.date.created: 2009-02-19T18:01:55Z
dc.date.issued: 2008
dc.identifier.citation: Tan, Henry and Hadzic, Fedja and Dillon, Tharam and Chang, Elizabeth and Feng, Ling. 2008. Tree model guided candidate generation for mining frequent subtrees from XML. ACM Transactions on Knowledge Discovery from Data 2 (2): pp. 1-43.
dc.identifier.uri: http://hdl.handle.net/20.500.11937/14717
dc.description.abstract

Due to the inherent flexibilities in both structure and semantics, XML association rules mining faces a few challenges, such as a more complicated hierarchical data structure and an ordered data context. Mining frequent patterns from XML documents can be recast as mining frequent tree structures from a database of XML documents. In this study, we model a database of XML documents as a database of rooted labeled ordered subtrees. In particular, we are mainly concerned with mining frequent induced and embedded ordered subtrees. Our main contributions are as follows. We describe our unique embedding list representation of the tree structure, which enables efficient implementation of our Tree Model Guided (TMG) candidate generation. TMG is an optimal, non-redundant enumeration strategy which enumerates all the valid candidates that conform to the structural aspects of the data. We show through a mathematical model and experiments that TMG has better complexity compared to the commonly used join approach. In this paper, we propose two algorithms, MB3-Miner and iMB3-Miner. MB3-Miner mines embedded subtrees. iMB3-Miner mines induced and/or embedded subtrees by using the maximum level of embedding constraint. Our experiments with both synthetic and real datasets against two well-known algorithms for mining induced and embedded subtrees demonstrate the effectiveness and the efficiency of the proposed techniques.

dc.publisher: ACM
dc.relation.uri: http://doi.acm.org/10.1145/1376815.1376818
dc.subject: FREQT
dc.subject: TreeMiner
dc.subject: Tree Model Guided
dc.subject: TMG
dc.subject: Tree Mining
dc.title: Tree model guided candidate generation for mining frequent subtrees from XML
dc.type: Journal Article
dcterms.source.volume: 2
dcterms.source.number: 2
dcterms.source.startPage: 1
dcterms.source.endPage: 43
dcterms.source.issn: 15564681
dcterms.source.title: ACM Transactions on Knowledge Discovery from Data
curtin.note

© ACM, 2008. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Knowledge Discovery from Data, {VOL 2, ISSN 15564681, (2008)} http://doi.acm.org/10.1145/1376815.1376818

curtin.accessStatus: Open access
curtin.faculty: Curtin Business School
curtin.faculty: Centre for Extended Enterprises and Business Intelligence

