Detecting SNP Interactions in Balanced and Imbalanced Datasets using Associative Classification
dc.contributor.author | Uppu, S. | |
dc.contributor.author | Krishna, Aneesh | |
dc.contributor.author | Gopalan, Raj | |
dc.date.accessioned | 2017-01-30T11:43:56Z | |
dc.date.available | 2017-01-30T11:43:56Z | |
dc.date.created | 2015-03-31T20:00:25Z | |
dc.date.issued | 2014 | |
dc.identifier.citation | Uppu, S. and Krishna, A. and Gopalan, R. 2014. Detecting SNP Interactions in Balanced and Imbalanced Datasets using Associative Classification. Australian Journal of Intelligent Information Processing Systems. 14 (1): pp. 7-18. | |
dc.identifier.uri | http://hdl.handle.net/20.500.11937/14459 | |
dc.description.abstract |
The genetic epidemiology behind complex diseases is characterised by multiple factors acting together or independently. The complex network of these factors induces pathological mechanisms that lead to disease manifestation. Advances in genotyping technology have dramatically increased the understanding of single nucleotide polymorphisms (SNPs) associated with complex diseases. The interactions between SNPs responsible for disease susceptibility are being intensively explored in this era of genome-wide association studies (GWAS). Several machine learning and data mining approaches have been proposed to track the inheritance of disease and its susceptibility to environmental factors. However, detecting these interactions continues to be a critical challenge due to bio-molecular complexities and computational limitations. The goal of this research is to study the effectiveness of associative classification for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for two-locus epistasis interactions using simulated data. The datasets were generated for 5 different penetrance functions by varying heritability, minor allele frequency and sample size. In total, 23,400 datasets were generated and several experiments were conducted to identify the disease-causal SNP interactions. The classification accuracy of the proposed approach was compared with that of previous approaches. Though associative classification showed only a small improvement in accuracy for balanced datasets, it outperformed existing approaches for higher-order multi-locus interactions in imbalanced datasets. | |
dc.publisher | Australian National University | |
dc.subject | associative classification | |
dc.subject | Epistasis | |
dc.subject | SNP interactions | |
dc.subject | multi-locus | |
dc.title | Detecting SNP Interactions in Balanced and Imbalanced Datasets using Associative Classification | |
dc.type | Journal Article | |
dcterms.source.volume | 14 | |
dcterms.source.number | 1 | |
dcterms.source.startPage | 7 | |
dcterms.source.endPage | 18 | |
dcterms.source.issn | 13212133 | |
dcterms.source.title | Australian Journal of Intelligent Information Processing Systems | |
curtin.department | Department of Computing | |
curtin.accessStatus | Open access |