Extracting powers and periods in a word from its runs structure
Access Status
Authors
Date
2014Collection
Type
Metadata
Show full item recordAbstract
A breakthrough in the field of text algorithms was the discovery of the fact that the maximal number of runs in a word of length n is O(n) and that they can all be computed in O(n) time. We study some applications of this result. New simpler O(n) time algorithms are presented for classical textual problems: computing all distinct kth word powers for a given k, in particular squares for k=2, and finding all local periods in a given word of length n. Additionally, we present an efficient algorithm for testing primitivity of factors of a word and computing their primitive roots. Applications of runs, despite their importance, are underrepresented in existing literature (approximately one page in the paper of Kolpakov and Kucherov, 1999 [25,26]). In this paper we attempt to fill in this gap. We use Lyndon words and introduce the Lyndon structure of runs as a useful tool when computing powers. In problems related to periods we use some versions of the Manhattan skyline problem.
Citation
Source Title
Department
Related items
Showing items related by title, author, creator and subject.

Crochemore, M.; Iliopoulos, Costas; Kubica, M.; Kubica, M.; Radoszewski, J.; Rytter, W.; Walen, T. (2014)A breakthrough in the field of text algorithms was the discovery of the fact that the maximal number of runs in a word of length n is O(n) and that they can all be computed in O(n) time. We study some applications of this ...

Mayall, Peter (2008)Learning and interpreting English in a commercial context is one of the hidden difficulties that students face when English is their second language. The difficulties are exacerbated when they are studying online or are ...

Williams, Robert (2006)Latent Semantic Analysis, when used for automated essay grading, makes use of document word count vectors for scoring the essays against domain knowledge. Words in the domain knowledge documents and essays are counted, ...