Mining substructures in protein data

Hadzic, Fedja; Dillon, Tharam S.; Sidhu, Amandeep; Chang, Elizabeth; Tan, H.

dc.contributor.author	Hadzic, Fedja
dc.contributor.author	Dillon, Tharam S.
dc.contributor.author	Sidhu, Amandeep
dc.contributor.author	Chang, Elizabeth
dc.contributor.author	Tan, H.
dc.date.accessioned	2017-01-30T10:58:50Z
dc.date.available	2017-01-30T10:58:50Z
dc.date.created	2008-11-12T23:32:32Z
dc.date.issued	2006
dc.identifier.citation	Hadzic, Fedja and Dillon, Tharam and Sidhu, Amandeep and Chang, Elizabeth and Tan, Henry. 2006. Mining substructures in protein data, in Tsumoto, Shusaku (ed), IEEE International Conference on Data Mining Workshops, Dec 18 2006, pp. 213-217. Hong Kong: IEEE.
dc.identifier.uri	http://hdl.handle.net/20.500.11937/7278
dc.description.abstract	In this paper we consider the 'Prions' database that describes protein instances stored for Human Prion Proteins. The Prions database can be viewed as a database of rooted ordered labeled subtrees. Mining frequent substructures from tree databases is an important task and it has gained a considerable amount of interest in areas such as XML mining, Bioinformatics, Web mining etc. This has given rise to the development of many tree mining algorithms which can aid in structural comparisons, association rule discovery and in general mining of tree structured knowledge representations. Previously we have developed the MB3 tree mining algorithm, which given a minimum support threshold, efficiently discovers all frequent embedded subtrees from a database of rooted ordered labeled subtrees. In this work we apply the algorithm to the Prions database in order to extract the frequently occurring patterns, which in this case are of induced subtree type. Obtaining the set of frequent induced subtrees from the Prions database can potentially reveal some useful knowledge. This aspect will be demonstrated by providing an analysis of the extracted frequent subtrees with respect to discovering interesting protein information. Furthermore, the minimum support threshold can be used as the controlling factor for answering specific queries posed on the Prions dataset. This approach is shown to be a viable technique for mining protein data.
dc.publisher	IEEE
dc.subject	structure matching
dc.subject	Protein discovery
dc.subject	frequent subtree mining
dc.subject	association mining
dc.title	Mining substructures in protein data
dc.type	Conference Paper
dcterms.source.startPage	213
dcterms.source.endPage	217
dcterms.source.title	Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops
dcterms.source.series	Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops
dcterms.source.conference	IEEE International Conference on Data Mining Workshops
dcterms.source.conference-start-date	Dec 18 2006
dcterms.source.conferencelocation	Hong Kong
dcterms.source.place	USA
curtin.note	Copyright 2006 IEEE
curtin.note	This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
curtin.department	Centre for Extended Enterprises and Business Intelligence
curtin.identifier	EPR-1259
curtin.accessStatus	Open access
curtin.faculty	Curtin Business School

Files in this item

Name: 20226_downloaded_stream_214.pdf
Size: 330.9Kb
Format: PDF

This item appears in the following Collection(s)

Curtin Research Publications
