A structure preserving flat data format representation for tree-structured data
MetadataShow full item record
Mining of semi-structured data such as XML is a popular research topic due to many useful applications. The initial work focused mainly on values associated with tags, while most of recent developments focus on discovering association rules among tree structured data objects to preserve the structural information. Other data mining techniques have had limited use in tree-structured data analysis as they were mainly designed to process flat data format with no need to capture the structural properties of data objects. This paper proposes a novel structure-preserving way for representing tree-structured document instances as records in a standard flat data structure to enable applicability of a wider range of data analysis techniques. The experiments using synthetic and real world data demonstrate the effectiveness of the proposed approach.
Showing items related by title, author, creator and subject.
Chang, Elizabeth; Tan, H.; Dillon, Tharam S.; Feng, L.; Hadzic, Fedja (2005)An XML enabled framework for representation of association rules in databases was first presented in [Feng03]. In Frequent Structure Mining (FSM), there are techniques proposed to mine frequent patterns from complex trees ...
Tan, H.; Hadzic, Fedja; Dillon, T. (2012)The increasing need for representing information through more complex structures where semantics and relationships among data objects can be more easily expressed has resulted in many semi-structured data sources. Structure ...
Hadzic, Fedja; Dillon, Tharam S.; Chang, Elizabeth (2008)Tree-structured knowledge representations are increasingly being used since the relationships between data objects can be represented in a more meaningful way. A number of tree mining algorithms were developed for mining ...