Rule-based analysis for detecting epistasis using associative classification mining
Date
2015
Abstract
Advances in high-throughput sequencing of the human genome and in computational capability have greatly improved understanding of the genetic architecture behind complex diseases. High-throughput genotyping and next-generation sequencing technologies provide large-scale data for genetic epidemiological analysis. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) associated with complex diseases. Interactions between SNPs that contribute to disease susceptibility are increasingly explored in the current literature. Such interaction studies are mathematically challenging and computationally complex, and a number of data mining and machine learning approaches have been proposed to address these challenges. The goal of this research is to implement associative classification and study its effectiveness for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated on simulated data for single-locus to six-locus models. The datasets were generated for five different penetrance functions by varying heritability, minor allele frequency and sample size. In total, 57,300 datasets were generated and several experiments were conducted to identify the disease-causal SNP interactions. The classification accuracy of the proposed approach was compared with that of existing approaches. The experimental results demonstrated significant improvements in accuracy for detecting interactions associated with the phenotype. Further, the approach was successfully applied to sporadic breast cancer data. The results show an interaction among six polymorphisms, which included five different estrogen-metabolism genes.
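For readers unfamiliar with associative classification, the idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simple exhaustive search for class association rules of the form "genotype pattern -> phenotype", filtered by support and confidence. The function names (`mine_class_rules`, `classify`), the thresholds and the eight-row toy dataset are all hypothetical and chosen only for illustration. In the toy data the phenotype depends on SNP 0 and SNP 1 jointly, so no single-locus rule is informative.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(genotypes, labels, max_order=2,
                     min_support=0.2, min_confidence=0.7):
    """Enumerate genotype patterns over up to `max_order` SNPs and keep
    rules (pattern -> label) that pass the support/confidence thresholds."""
    n = len(genotypes)
    n_snps = len(genotypes[0])
    rules = []
    for order in range(1, max_order + 1):
        for snps in combinations(range(n_snps), order):
            pattern_counts = Counter()  # how often each pattern occurs
            rule_counts = Counter()     # how often pattern and label co-occur
            for row, label in zip(genotypes, labels):
                pattern = tuple((i, row[i]) for i in snps)
                pattern_counts[pattern] += 1
                rule_counts[(pattern, label)] += 1
            for (pattern, label), count in rule_counts.items():
                support = count / n
                confidence = count / pattern_counts[pattern]
                if support >= min_support and confidence >= min_confidence:
                    rules.append((pattern, label, support, confidence))
    # Rank rules: higher confidence first, then higher support, then shorter.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, row, default):
    """Predict with the first (highest-ranked) rule whose pattern matches."""
    for pattern, label, _, _ in rules:
        if all(row[i] == g for i, g in pattern):
            return label
    return default


# Hypothetical toy data: three SNPs with genotypes coded 0/1.
# Phenotype is 1 only when SNP 0 and SNP 1 differ; SNP 2 is noise.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [1 if a != b else 0 for a, b, _ in genotypes]

rules = mine_class_rules(genotypes, labels)
for pattern, label, support, confidence in rules:
    print(pattern, "->", label, f"support={support:.2f}", f"confidence={confidence:.2f}")
```

In this toy setting every surviving rule involves both SNP 0 and SNP 1, which is the kind of purely interactive signal that single-locus tests miss. Exhaustive enumeration scales poorly with the number of SNPs and the interaction order, so a practical system would need pruning or a dedicated rule-mining algorithm.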
Related items
Showing items related by title, author, creator and subject.
- Uppu, S.; Krishna, Aneesh; Gopalan, Raj (2014) The genetic epidemiology behind the complex diseases are characterised by multiple factors acting together or independently. The complex network of these multiple factors induces pathological mechanisms which lead to ...
- Uppu, S.; Krishna, Aneesh; Gopalan, Raj (2014) There have been many studies that depict genotype phenotype relationships by identifying genetic variants associated with a specific disease. Researchers focus more attention on interactions between SNPs that are strongly ...
- Krishna, Aneesh (2016) In this era of genome-wide association studies (GWAS), the quest for understanding the genetic architecture of complex diseases is rapidly increasing more than ever before. The development of high throughput genotyping ...