Stabilizing sparse Cox model using statistic and semantic structures in electronic medical records
Date
2015
Abstract
Stability in clinical prediction models is crucial for transferability between studies, yet it has received little attention. The problem is paramount in high-dimensional data, which invites sparse models with feature selection capability. We introduce an effective method to stabilize a sparse Cox model of time-to-event data using statistical and semantic structures inherent in Electronic Medical Records (EMR). Model estimation is stabilized using three feature graphs built from (i) Jaccard similarity among features, (ii) an aggregation of the Jaccard similarity graph and a recently introduced semantic EMR graph, and (iii) Jaccard similarity among features transferred from a related cohort. Our experiments are conducted on two real-world hospital datasets: a heart failure cohort and a diabetes cohort. On two stability measures, the Consistency index and the signal-to-noise ratio (SNR), our proposed methods significantly increased feature stability compared with the baselines.
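As a rough illustration of two ingredients named in the abstract, the sketch below builds a Jaccard similarity graph over features and scores the agreement between two selected feature subsets. It is not the authors' implementation. The feature names and patient ids are hypothetical, the function names are invented for this example, and the consistency index is assumed to take Kuncheva's form for two equal-sized subsets, which the abstract does not specify.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| between two sets of patient ids."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def jaccard_graph(feature_to_patients, threshold=0.0):
    """Return weighted edges {(feature_i, feature_j): similarity}.

    feature_to_patients maps a feature name to the set of patients in whom
    that feature is present. Only pairs with similarity above `threshold`
    become edges (the threshold is an assumption of this sketch).
    """
    edges = {}
    for f, g in combinations(sorted(feature_to_patients), 2):
        w = jaccard(feature_to_patients[f], feature_to_patients[g])
        if w > threshold:
            edges[(f, g)] = w
    return edges


def consistency_index(selected_a, selected_b, n_features):
    """Chance-corrected overlap of two equal-sized feature subsets.

    Assumed form: (r * d - k^2) / (k * (d - k)), where r is the overlap size,
    k the subset size and d the total number of features.
    """
    k = len(selected_a)
    if k != len(selected_b):
        raise ValueError("subsets must have the same size")
    if k == 0 or k == n_features:
        raise ValueError("subset size must be strictly between 0 and n_features")
    r = len(selected_a & selected_b)
    return (r * n_features - k * k) / (k * (n_features - k))


# Hypothetical toy data: which patients carry which feature.
toy = {
    "diabetes_code": {1, 2, 3},
    "insulin": {2, 3, 4},
    "fracture": {5},
}
graph = jaccard_graph(toy)
same = consistency_index({"a", "b"}, {"a", "b"}, n_features=10)
disjoint = consistency_index({"a", "b"}, {"c", "d"}, n_features=10)
```

In a graph-regularized sparse Cox model, edge weights such as these would typically encourage linked features to receive similar coefficients, so that correlated features are selected together rather than swapped arbitrarily between training samples.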
Related items
Showing items related by title, author, creator and subject.
- Tran, The Truyen; Phung, D.; Luo, W.; Venkatesh, S. (2014) The recent wide adoption of electronic medical records (EMRs) presents great opportunities and challenges for data mining. The EMR data are largely temporal, often noisy, irregular and high dimensional. This paper constructs ...
- Gopakumar, S.; Tran, The Truyen; Nguyen, T.; Phung, D.; Venkatesh, S. (2015) © 2014 IEEE. We investigate feature stability in the context of clinical prognosis derived from high-dimensional electronic medical records. To reduce variance in the selected features that are predictive, we introduce ...
- Nordin, Syarifah Zyurina (2011) Task scheduling in parallel processing systems is one of the most challenging industrial problems. This problem typically arises in the manufacturing and service industries. The task scheduling problem is to determine a ...