Mining unordered distance-constrained embedded subtrees
Frequent subtree mining is an important problem in association rule mining from semi-structured or tree-structured documents, which are common in commercial, web and scientific domains. This paper presents the u3Razor algorithm for mining unordered embedded subtrees where the distance of nodes relative to the root of the subtree needs to be considered. Mining distance-constrained unordered embedded subtrees will have important applications in web information systems, conceptual model analysis and more sophisticated knowledge matching. An encoding strategy is presented to efficiently enumerate candidate unordered embedded subtrees, taking the distance of nodes relative to the root of the subtree into account. Both synthetic and real-world datasets were used for experimental evaluation and discussion.
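To illustrate what a distance constraint changes, here is a minimal sketch. It is not the u3Razor encoding, which the abstract does not describe. It assumes a tree is given as a pre-order list of labels in which -1 marks a backtrack to the parent, and it only counts two-node candidates rooted at the tree's root. The function names `node_depths` and `two_node_candidates` are hypothetical.

```python
from collections import Counter

def node_depths(encoding):
    """Return (label, depth) pairs for a pre-order encoding.

    Assumption: -1 is a backtrack symbol meaning "move up to the parent".
    """
    pairs = []
    depth = 0
    for symbol in encoding:
        if symbol == -1:
            depth -= 1          # go back up one level
        else:
            pairs.append((symbol, depth))
            depth += 1          # the next label would be a child
    return pairs

def two_node_candidates(encoding, with_distance):
    """Count root-to-descendant pairs, optionally keyed by distance to the root."""
    pairs = node_depths(encoding)
    root_label, _ = pairs[0]
    counts = Counter()
    for label, depth in pairs[1:]:
        if with_distance:
            key = (root_label, label, depth)  # distance is part of the candidate
        else:
            key = (root_label, label)         # plain embedded (ancestor-descendant) pair
        counts[key] += 1
    return counts

# Tree: a has children b and c; b has a child c.
tree = ["a", "b", "c", -1, -1, "c", -1]
plain = two_node_candidates(tree, with_distance=False)
constrained = two_node_candidates(tree, with_distance=True)
```

In this example the plain embedded view treats both `c` nodes as the same candidate under `a`, while the distance-aware view keeps the `c` at distance 1 and the `c` at distance 2 as separate candidates.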
The original publication is available at http://www.springerlink.com
Showing items related by title, author, creator and subject.
Hadzic, Fedja; Tan, H.; Dillon, Tharam (2011) Table of Contents: Introduction - Tree Mining Problem - Algorithm Development Issues - Tree Model Guided Framework - TMG Framework for Mining Ordered Subtrees - TMG Framework for Mining Unordered Subtrees - Mining ...
Tan, H.; Dillon, Tharam S.; Hadzic, Fedja; Chang, Elizabeth (2006) Our work is focused on the task of mining frequent subtrees from a database of rooted ordered labelled subtrees. Previously we have developed an efficient algorithm, MB3, for mining frequent embedded subtrees from a ...
Hadzic, Fedja; Tan, H.; Dillon, Tharam S. (2010) A large amount of online information is or can be represented using semi-structured documents, such as XML. The information contained in an XML document can be effectively represented using a rooted ordered labeled tree. ...