
dc.contributor.author: Puglisi, Simon
dc.contributor.author: Smyth, William
dc.contributor.author: Yusufu, M.
dc.date.accessioned: 2017-01-30T13:57:40Z
dc.date.available: 2017-01-30T13:57:40Z
dc.date.created: 2011-03-20T20:01:53Z
dc.date.issued: 2010
dc.identifier.citation: Puglisi, Simon J. and Smyth, W.F. and Yusufu, Munina. 2010. Fast, Practical Algorithms for Computing All the Repeats in a String. Mathematics in Computer Science. 3 (4): pp. 373-389.
dc.identifier.uri: http://hdl.handle.net/20.500.11937/36774
dc.identifier.doi: 10.1007/s11786-010-0033-6
dc.description.abstract

Given a string x = x[1..n] on an alphabet of size α, and a threshold p_min ≥ 1, we describe four variants of an algorithm PSY1 that, using a suffix array, computes all the complete nonextendible repeats in x of length p ≥ p_min. The basic algorithm PSY1–1 and its simple extension PSY1–2 are fast on strings that occur in biological, natural language and other applications (not highly periodic strings), while PSY1–3 guarantees Θ(n) worst-case execution time. The final variant, PSY1–4, also achieves Θ(n) processing time and, over the complete range of strings tested, is the fastest of the four. The space requirement of all four algorithms is about 5n bytes, but all make use of the “longest common prefix” (LCP) array, whose construction requires about 6n bytes. The four algorithms are faster in applications and use less space than a recently-proposed algorithm (Narisawa in Proceedings of 18th Annual Symposium on Combinatorial Pattern Matching, pp. 340–351, 2007) that produces equivalent output. The suffix array is not explicitly used by algorithms PSY1, but may be required for postprocessing; in this case, storage requirements rise to 9n bytes. We also describe two variants of a fast Θ(n)-time algorithm PSY2 for computing all complete supernonextendible repeats in x.
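
To make the role of the suffix array and LCP array concrete, the following is a minimal Python sketch that reports repeats of length at least p_min that cannot be extended to the left or to the right without losing an occurrence. It is not PSY1 or any of its variants: it uses a naive suffix sort and a quadratic interval scan, and it assumes that "nonextendible" corresponds to the usual notion of a maximal repeat. The function names (`suffix_array`, `lcp_array`, `maximal_repeats`) are hypothetical, and positions are 0-based, unlike the 1-based x[1..n] of the abstract.

```python
def suffix_array(x):
    # Naive construction by sorting suffixes; adequate only for small inputs.
    return sorted(range(len(x)), key=lambda i: x[i:])


def lcp_array(x, sa):
    # Kasai-style computation: lcp[r] is the length of the longest common
    # prefix of the suffixes starting at sa[r - 1] and sa[r]; lcp[0] = 0.
    n = len(x)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and x[i + h] == x[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def maximal_repeats(x, p_min=1):
    # Returns {repeated substring: sorted list of all 0-based start positions}.
    sa = suffix_array(x)
    lcp = lcp_array(x, sa)
    n = len(x)
    found = {}
    for r in range(1, n):
        length = lcp[r]
        if length < p_min:
            continue
        # Grow the widest suffix-array range [lo, hi] around r whose suffixes
        # all share a prefix of this length. Because lcp[r] == length, the
        # occurrences cannot all be extended to the right.
        lo = r - 1
        while lo > 0 and lcp[lo] >= length:
            lo -= 1
        hi = r
        while hi + 1 < n and lcp[hi + 1] >= length:
            hi += 1
        starts = sorted(sa[lo:hi + 1])
        # The repeat cannot be extended to the left if the preceding
        # characters differ (None marks an occurrence at position 0).
        left = {x[i - 1] if i > 0 else None for i in starts}
        if len(left) > 1:
            found[x[starts[0]:starts[0] + length]] = starts
    return found
```

For example, `maximal_repeats("banana", p_min=2)` returns `{"ana": [1, 3]}`: the repeat "na" is excluded because both of its occurrences are preceded by "a" and so can be extended to the left.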

dc.publisher: Springer
dc.subject: Repeat – Repetition – Suffix array – Suffix tree
dc.title: Fast, Practical Algorithms for Computing All the Repeats in a String
dc.type: Journal Article
dcterms.source.volume: 3
dcterms.source.number: 4
dcterms.source.startPage: 373
dcterms.source.endPage: 389
dcterms.source.issn: 16618270
dcterms.source.title: Mathematics in Computer Science
curtin.note

The original publication is available at http://www.springerlink.com

curtin.department: Digital Ecosystems and Business Intelligence Institute (DEBII)
curtin.accessStatus: Open access

