Data integration through protein ontology
Date
2007
Remarks
This chapter appears in Data Mining with Ontologies: Implementations, Findings, and Frameworks, edited by H. O. Nigro, S. E. G. Cisaro and D. H. Xodo.
Copyright 2006, IGI Global, www.igi-global.com. Posted by permission of the publisher.
Abstract
Traditional approaches to integrating protein data have generally relied on keyword searches, which immediately exclude unannotated or poorly annotated data. An alternative protein annotation approach is to rely on sequence identity, structural similarity, or functional identification. However, some proteins have a high degree of sequence identity, structural similarity, or similarity in functions that are unique to members of that family alone. Consequently, this approach cannot be generalized to integrate protein data. Clearly, these traditional approaches have limitations in capturing and integrating data for protein annotation. For these reasons, we have adopted an alternative method that does not rely on keywords or similarity metrics but instead uses an ontology. In this chapter we discuss the conceptual framework of a protein ontology that has: a hierarchical classification of concepts represented as classes, from general to specific; a list of attributes related to each concept, for each class; a set of relations between classes that link concepts in the ontology in more complicated ways than implied by the hierarchy, to promote reuse of concepts in the ontology; and a set of algebraic operators for querying protein ontology instances.
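The four components listed in the abstract (a class hierarchy, per-class attributes, relations between classes, and operators for querying instances) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in chosen for illustration: the concept names (`ProteinData`, `Structure`, `Function`), the attribute names, the relation label, and the `select` operator are assumptions, not the chapter's actual ontology or its algebra.

```python
class Concept:
    """One ontology class: a name, an optional parent class, and its own attributes."""

    def __init__(self, name, parent=None, attributes=()):
        self.name = name
        self.parent = parent
        self.attributes = tuple(attributes)

    def is_a(self, other):
        """True if this concept is `other` or a descendant of it in the hierarchy."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def all_attributes(self):
        """Attributes inherited from ancestors, followed by this concept's own."""
        inherited = self.parent.all_attributes() if self.parent else ()
        return inherited + self.attributes


class Instance:
    """A data record typed by a concept, with attribute values stored by name."""

    def __init__(self, concept, **values):
        self.concept = concept
        self.values = values


def select(instances, concept, predicate=lambda inst: True):
    """Selection-style operator: instances of `concept` (or a subclass) that satisfy `predicate`."""
    return [i for i in instances if i.concept.is_a(concept) and predicate(i)]


# Hypothetical hierarchy, from general to specific (names are assumptions).
protein_data = Concept("ProteinData", attributes=("identifier",))
structure = Concept("Structure", parent=protein_data, attributes=("chain_count",))
function = Concept("Function", parent=protein_data, attributes=("description",))

# Relations that link concepts beyond the parent/child hierarchy,
# stored as (source, label, target) triples. The label is an assumption.
relations = [(structure, "supports", function)]

# Hypothetical instance data.
records = [
    Instance(structure, identifier="P1", chain_count=2),
    Instance(structure, identifier="P2", chain_count=4),
    Instance(function, identifier="P1", description="example function"),
]

# Query by concept: selecting on the general class also returns subclass instances.
everything = select(records, protein_data)
large = select(records, structure, lambda i: i.values["chain_count"] > 2)
```

In this sketch a query is phrased against a concept rather than a keyword, so a record is found through its position in the hierarchy even when its free-text annotation is sparse.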
Related items
Showing items related by title, author, creator and subject.
-
Chang, Elizabeth; Sidhu, Amandeep; Dillon, Tharam S. (2005) Protein Data Integration approaches at the moment considers data sources as data repositories, but not as applications; which in turn may embody complex interactions with other data sources. Current approaches do not ...
-
Sidhu, Amandeep; Dillon, Tharam S.; Chang, Elizabeth (2007) Two factors dominate current developments in structural bioinformatics, especially in protein informatics and related areas: (1) the amount of raw data is increasing, very rapidly; and (2) successful application of data ...
-
Sidhu, Amandeep; Dillon, Tharam S.; Hussain, Farookh Khadeer; Chang, Elizabeth (2006) Recent progress in proteomics, computational biology, and ontology development has presented an opportunity to investigate protein data sources from unique perspective that is, examining protein data sources through ...