IMB3-Miner: Mining induced/embedded subtrees by constraining the level of embedding
Date
2006
Remarks
The original publication is available at: www.springerlink.com
Abstract
Tree mining has recently attracted considerable interest in areas such as bioinformatics, XML mining, and Web mining. We are mainly concerned with mining frequent induced and embedded subtrees. While mining embedded subtrees can yield more interesting patterns, mining such embedding relationships can be very costly. In this paper, we propose an efficient approach to tackle the complexity of mining embedded subtrees by utilizing a novel Embedding List representation and Tree Model Guided enumeration, and by introducing the Level of Embedding constraint. Thus, when it is too costly to mine all frequent embedded subtrees, one can gradually decrease the level of embedding constraint down to 1, at which point all the frequent subtrees obtained are induced subtrees. Our experiments with both synthetic and real datasets against two known algorithms for mining induced and embedded subtrees, FREQT and TreeMiner, demonstrate the effectiveness and efficiency of the technique.
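The effect of the level of embedding constraint can be illustrated with a small sketch. The following is not the IMB3-Miner algorithm and does not use the Embedding List or Tree Model Guided enumeration; it only assumes that the level of embedding between an ancestor and a descendant is the number of edges on the path between them. The toy tree `PARENT` and the helper names `embedding_level` and `allowed_pairs` are hypothetical, introduced purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy tree stored as child -> parent (the root maps to None):
#        a
#       / \
#      b   c
#      |
#      d
#      |
#      e
PARENT: Dict[str, Optional[str]] = {
    "a": None, "b": "a", "c": "a", "d": "b", "e": "d",
}


def embedding_level(parent: Dict[str, Optional[str]],
                    ancestor: str, descendant: str) -> Optional[int]:
    """Return the number of edges from `ancestor` down to `descendant`,
    or None if `ancestor` is not a proper ancestor of `descendant`."""
    steps = 0
    node: Optional[str] = descendant
    while node is not None:
        if node == ancestor:
            # A node is not its own ancestor, so zero steps means no relation.
            return steps if steps > 0 else None
        node = parent[node]
        steps += 1
    return None


def allowed_pairs(parent: Dict[str, Optional[str]],
                  max_level: int) -> List[Tuple[str, str]]:
    """List the (ancestor, descendant) pairs that a pattern edge may connect
    when the level of embedding is capped at `max_level`."""
    pairs = []
    for anc in parent:
        for desc in parent:
            level = embedding_level(parent, anc, desc)
            if level is not None and level <= max_level:
                pairs.append((anc, desc))
    return sorted(pairs)


if __name__ == "__main__":
    for max_level in (1, 2, 3):
        print(max_level, allowed_pairs(PARENT, max_level))
```

With `max_level` set to 1, only parent-child pairs remain, which corresponds to the induced case described in the abstract; raising the cap admits progressively more distant ancestor-descendant pairs, which is where the additional cost of mining embedded subtrees comes from.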
Related items
Showing items related by title, author, creator and subject.
- Tan, H.; Hadzic, Fedja; Dillon, T. (2012) The increasing need for representing information through more complex structures where semantics and relationships among data objects can be more easily expressed has resulted in many semi-structured data sources. Structure ...
- Tan, Henry; Hadzic, Fedja; Dillon, Tharam S.; Chang, Elizabeth; Feng, Ling; Feng, L. (2008) Due to the inherent flexibilities in both structure and semantics, XML association rules mining faces a few challenges, such as a more complicated hierarchical data structure and ordered data context. Mining frequent ...
- Mohd Shaharanee, Izwan Nizal (2012) Deriving useful and interesting rules from a data mining system is an essential and important task. Problems such as the discovery of random and coincidental patterns or patterns with no significant values, and the generation ...