Detecting SNP Interactions in Balanced and Imbalanced Datasets using Associative Classification
Date
2014
Abstract
The genetic epidemiology of complex diseases is characterised by multiple factors acting together or independently. The complex network of these factors induces pathological mechanisms that lead to disease manifestation. Advances in genotyping technology have dramatically increased the understanding of single nucleotide polymorphisms (SNPs) associated with complex diseases. The interactions between SNPs responsible for disease susceptibility are being intensively explored in this era of genome-wide association studies (GWAS). Several machine learning and data mining approaches have been proposed to track the inheritance of disease and its susceptibility to environmental factors. However, detecting these interactions remains a critical challenge due to bio-molecular complexities and computational limitations. The goal of this research is to study the effectiveness of associative classification for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for two-locus epistatic interactions using simulated data. The datasets were generated for 5 different penetrance functions by varying heritability, minor allele frequency and sample size. In total, 23,400 datasets were generated and several experiments were conducted to identify the disease-causing SNP interactions. The classification accuracy of the proposed approach was compared with that of previous approaches. Although associative classification showed only a small improvement in accuracy for balanced datasets, it outperformed existing approaches for higher-order multi-locus interactions in imbalanced datasets.
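As an illustration of the kind of simulated data the abstract describes, the following is a minimal sketch of generating a two-locus case-control dataset from a penetrance table. It is not the authors' generator: the function names, the XOR-like penetrance values, the choice of SNPs 0 and 1 as the causal pair, and the Hardy-Weinberg assumption are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical 3x3 penetrance table: P(disease | genotype at SNP 0, genotype at SNP 1).
# Rows and columns are indexed by the number of minor alleles (0, 1, 2).
# These values are illustrative only, not one of the study's five penetrance functions.
XOR_LIKE_PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

def sample_genotype(maf, rng):
    """Draw a minor-allele count (0, 1 or 2), assuming Hardy-Weinberg equilibrium."""
    return int(rng.random() < maf) + int(rng.random() < maf)

def simulate(penetrance, maf, n_cases, n_controls, n_snps=10, seed=0):
    """Sample individuals until the requested numbers of cases and controls are reached.

    Unequal n_cases and n_controls give an imbalanced dataset.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        genotypes = [sample_genotype(maf, rng) for _ in range(n_snps)]
        # Assumption: SNPs 0 and 1 form the interacting (causal) pair.
        p_disease = penetrance[genotypes[0]][genotypes[1]]
        affected = rng.random() < p_disease
        if affected and len(cases) < n_cases:
            cases.append(genotypes)
        elif not affected and len(controls) < n_controls:
            controls.append(genotypes)
    return cases, controls

# Example: an imbalanced dataset with twice as many controls as cases.
cases, controls = simulate(XOR_LIKE_PENETRANCE, maf=0.3, n_cases=20, n_controls=40)
```

Varying `maf`, the penetrance table, and the case and control counts corresponds loosely to the parameters the abstract says were varied (minor allele frequency, penetrance function and heritability, and sample size).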
Related items
Showing items related by title, author, creator and subject.
- Uppu, S.; Krishna, Aneesh; Gopalan, Raj (2014) There have been many studies that depict genotype-phenotype relationships by identifying genetic variants associated with a specific disease. Researchers focus more attention on interactions between SNPs that are strongly ...
- Uppu, Suneetha; Krishna, Aneesh; Gopalan, Raj (2015) The advancements in high-throughput sequencing of the human genome and in computational abilities have tremendously improved the understanding of the genetic architecture behind complex diseases. The development of high-throughput ...
- Uppu, S.; Krishna, Aneesh; Gopalan, R. (2016) The complexity of phenotype-genotype mapping is characterised by non-linear interactions between gene-gene and gene-environmental factors. These interaction studies provide better understanding of underlying biological ...