Effective pruning strategies for sequential pattern mining
In this paper, we systematically explore the search space of frequent sequence mining and present two novel pruning strategies, SEP (Sequence Extension Pruning) and IEP (Item Extension Pruning), which can be used in all Apriori-like sequence mining algorithms and lattice-theoretic approaches. At the cost of a little more memory, the proposed strategies prune invalid parts of the search space and effectively reduce the total cost of frequency counting. To test their effectiveness, we optimize SPAM [2] and present the improved algorithm, SPAMSEPIEP, which uses SEP and IEP to prune the search space by sharing the lists of frequent 2-sequences. A comprehensive set of performance experiments shows that SPAMSEPIEP outperforms SPAM by a factor of 10 on small datasets and by 30% to 50% on reasonably large datasets.
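The abstract does not define SEP and IEP in detail, so the following is only a minimal sketch of the general idea of pruning candidate extensions with shared lists of frequent 2-sequences. It relies solely on the Apriori property (every subsequence of a frequent sequence must itself be frequent) and is not the authors' algorithm. The function names, the data layout (a sequence as a list of `frozenset` itemsets), and the toy database are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def frequent_two_sequences(db, min_sup):
    """Count every 2-sequence once and return the frequent ones.

    s_freq holds pairs (a, b) where a occurs in an earlier itemset than b.
    i_freq holds sorted pairs (a, b) that occur together in one itemset.
    """
    s_count, i_count = defaultdict(int), defaultdict(int)
    for seq in db:
        s_seen, i_seen = set(), set()
        seen_before = set()  # items from strictly earlier itemsets
        for itemset in seq:
            for a in seen_before:
                for b in itemset:
                    s_seen.add((a, b))
            for pair in combinations(sorted(itemset), 2):
                i_seen.add(pair)
            seen_before |= itemset
        # Each sequence contributes at most 1 to the support of a pair.
        for pair in s_seen:
            s_count[pair] += 1
        for pair in i_seen:
            i_count[pair] += 1
    s_freq = {p for p, c in s_count.items() if c >= min_sup}
    i_freq = {p for p, c in i_count.items() if c >= min_sup}
    return s_freq, i_freq


def prune_s_candidates(pattern, candidates, s_freq):
    """Keep item b for a sequence extension only if <(a)(b)> is frequent
    for every item a already in the pattern (assumed pruning rule)."""
    items = set().union(*pattern)
    return [b for b in candidates if all((a, b) in s_freq for a in items)]


def prune_i_candidates(pattern, candidates, i_freq):
    """Keep item b for an item extension only if (a, b) is a frequent
    2-itemset for every item a in the last itemset (assumed pruning rule)."""
    last = pattern[-1]
    return [
        b for b in candidates
        if all((min(a, b), max(a, b)) in i_freq for a in last)
    ]


# Hypothetical toy database of three sequences.
db = [
    [frozenset("a"), frozenset("bc")],
    [frozenset("a"), frozenset("bc")],
    [frozenset("a"), frozenset("c"), frozenset("b")],
]
s_freq, i_freq = frequent_two_sequences(db, min_sup=2)
s_kept = prune_s_candidates([frozenset("a")], ["a", "b", "c"], s_freq)
i_kept = prune_i_candidates([frozenset("a"), frozenset("b")], ["c"], i_freq)
```

In this sketch the 2-sequence lists are computed once and reused at every node of the search, so a pruned candidate never reaches the frequency-counting step. That is the trade the abstract describes: slightly more memory in exchange for less counting.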
Copyright © 2008 IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.