Fast algorithms for approximate circular string matching
Access Status
Authors
Date
2014Type
Metadata
Show full item recordCitation
Source Title
ISSN
School
Collection
Abstract
Background: Circular string matching is a problem which naturally arises in many biological contexts. It consists in finding all occurrences of the rotations of a pattern of length m in a text of length n. There exist optimal average-case algorithms for exact circular string matching. Approximate circular string matching is a rather undeveloped area.Results: In this article, we present a suboptimal average-case algorithm for exact circular string matching requiring time O(n). Based on our solution for the exact case, we present two fast average-case algorithms for approximate circular string matching with k-mismatches, under the Hamming distance model, requiring time O(n) for moderate values of k, that is k = O(m/ logm). We show how the same results can be easily obtained under the edit distancemodel. The presented algorithms are also implemented as library functions. Experimental results demonstrate thatthe functions provided in this library accelerate the computations by more than three orders of magnitude compared to a naïve approach.Conclusions: We present two fast average-case algorithms for approximate circular string matching with k-mismatches; and show that they also perform very well in practice. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any biological pipeline. The source code of the library is freely available at http://www.inf.kcl.ac.uk/research/projects/asmf/.
Related items
Showing items related by title, author, creator and subject.
-
Kim, J.; Eades, P.; Fleischer, R.; Hong, S.; Iliopoulos, Costas; Park, K.; Puglisi, Simon; Tokuyama, T. (2014)We introduce a new string matching problem called order-preserving matching on numeric strings, where a pattern matches a text if the text contains a substring of values whose relative orders coincide with those of the ...
-
Smyth, Bill; Wang, S. (2009)We describe a hybrid pattern-matching algorithm that works on both regular and indeterminate strings. This algorithm is inspired by the recently proposed hybrid algorithm FJS and its indeterminate successor. However, as ...
-
Smyth, Bill; Wang, Shu; Yu, Mao (2008)We describe a hybrid pattern-matching algorithm that works on both regular and indeterminate strings. This algorithm is inspired by the recently proposed hybrid algorithm FJS [11] and its indeterminate successor [15]. ...