A structure preserving flat data format representation for tree-structured data
dc.contributor.author | Hadzic, Fedja | |
dc.contributor.editor | L. Cao | |
dc.contributor.editor | J. Huang | |
dc.contributor.editor | J. Bailey | |
dc.contributor.editor | Y. Koh | |
dc.contributor.editor | J. Luo | |
dc.date.accessioned | 2017-01-30T14:06:49Z | |
dc.date.available | 2017-01-30T14:06:49Z | |
dc.date.created | 2012-03-07T20:01:06Z | |
dc.date.issued | 2012 | |
dc.identifier.citation | Hadzic, Fedja. 2012. A structure preserving flat data format representation for tree-structured data, in Cao, L. and Huang, J. and Bailey, J. and Koh, Y. and Luo, J. (ed), New Frontiers in Applied Data Mining, PAKDD 2011 International Workshops, May 24 2011, pp. 221-233. Shenzhen, China: Springer. | |
dc.identifier.uri | http://hdl.handle.net/20.500.11937/37719 | |
dc.description.abstract |
Mining of semi-structured data such as XML is a popular research topic because of its many useful applications. Initial work focused mainly on the values associated with tags, while most recent developments focus on discovering association rules among tree-structured data objects in order to preserve the structural information. Other data mining techniques have had limited use in tree-structured data analysis, as they were mainly designed to process flat data formats, with no need to capture the structural properties of data objects. This paper proposes a novel structure-preserving way of representing tree-structured document instances as records in a standard flat data structure, so that a wider range of data analysis techniques can be applied. Experiments using synthetic and real-world data demonstrate the effectiveness of the proposed approach. | |
dc.publisher | Springer | |
dc.relation.uri | http://conferences.telecom-bretagne.eu/data/qimie2011/hadzic-informal_QIMIE_2011.pdf | |
dc.subject | XML mining | |
dc.subject | decision tree learning from XML data | |
dc.subject | tree mining | |
dc.title | A structure preserving flat data format representation for tree-structured data | |
dc.type | Conference Paper | |
dcterms.source.startPage | 221 | |
dcterms.source.endPage | 233 | |
dcterms.source.title | New Frontiers in Applied Data Mining | |
dcterms.source.series | New Frontiers in Applied Data Mining | |
dcterms.source.isbn | 9783642283192 | |
dcterms.source.conference | PAKDD 2011 International Workshops | |
dcterms.source.conference-start-date | May 24 2011 | |
dcterms.source.conferencelocation | Shenzhen, China | |
dcterms.source.place | Berlin, Germany | |
curtin.department | Department of Computing | |
curtin.accessStatus | Fulltext not available |