Razor: Mining distance-constrained embedded subtrees

Tan, H.; Dillon, Tharam S.; Hadzic, Fedja; Chang, Elizabeth

dc.contributor.author	Tan, H.
dc.contributor.author	Dillon, Tharam S.
dc.contributor.author	Hadzic, Fedja
dc.contributor.author	Chang, Elizabeth
dc.date.accessioned	2017-01-30T10:43:12Z
dc.date.available	2017-01-30T10:43:12Z
dc.date.created	2008-11-12T23:32:32Z
dc.date.issued	2006
dc.identifier.citation	Tan, Henry and Dillon, Tharam and Hadzic, Fedja and Chang, Elizabeth. 2006. Razor: Mining distance-constrained embedded subtrees, in Tsumoto, Shusaku (ed), IEEE International Conference on Data Mining Workshops, Dec 18 2006, pp. 8-13. Hong Kong: IEEE.
dc.identifier.uri	http://hdl.handle.net/20.500.11937/5006
dc.description.abstract	Our work focuses on the task of mining frequent subtrees from a database of rooted, ordered, labeled subtrees. We previously developed an efficient algorithm, MB3 [12], for mining frequent embedded subtrees from such a database. Its efficiency comes from a novel Embedding List representation used for Tree Model Guided (TMG) candidate generation. As an extension, the IMB3 [13] algorithm introduces the Level of Embedding constraint. In this study we extend our past work by developing an algorithm, Razor, for mining embedded subtrees in which the distance of nodes relative to the root of the subtree is taken into account. This notion of distance-constrained embedded subtree mining has important applications in web information systems, conceptual model analysis and more sophisticated ontology matching. Domains that represent their knowledge in a tree-structured form may require this additional distance information, as it commonly indicates the amount of specific knowledge stored about a particular concept within the hierarchy. Structure-based approaches to schema matching commonly take the distance among the concept nodes within a sub-structure into account when evaluating concept similarity across different schemas. We present an encoding strategy that efficiently enumerates candidate subtrees while taking the distance of nodes relative to the root of the subtree into account. The algorithm is applied to both synthetic and real-world datasets, and the experimental results demonstrate the correctness and effectiveness of the proposed technique.
dc.publisher	IEEE
dc.subject	embedded subtree
dc.subject	structure matching
dc.subject	mining with constraints
dc.subject	frequent subtree mining
dc.subject	association mining
dc.title	Razor: Mining distance-constrained embedded subtrees
dc.type	Conference Paper
dcterms.source.startPage	8
dcterms.source.endPage	13
dcterms.source.title	Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops
dcterms.source.series	Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops
dcterms.source.conference	IEEE International Conference on Data Mining Workshops
dcterms.source.conference-start-date	Dec 18 2006
dcterms.source.conferencelocation	Hong Kong
dcterms.source.place	USA
curtin.note	Copyright 2006 IEEE
curtin.note	This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
curtin.department	Centre for Extended Enterprises and Business Intelligence
curtin.identifier	EPR-1258
curtin.accessStatus	Open access
curtin.faculty	Curtin Business School

Files in this item

Name: 20227_downloaded_stream_215.pdf
Size: 381.6Kb
Format: PDF

This item appears in the following Collection(s)

Curtin Research Publications
