Efficient algorithms for two extensions of LPF table: the power of suffix arrays

Crochemore, M.; Iliopoulos, Costas; Kubica, M.; Rytter, W.; Walen, T.

doi:10.1007/978-3-642-11266-9_25

dc.contributor.author	Crochemore, M.
dc.contributor.author	Iliopoulos, Costas
dc.contributor.author	Kubica, M.
dc.contributor.author	Rytter, W.
dc.contributor.author	Walen, T.
dc.contributor.editor	Jan v Leeuwen
dc.contributor.editor	Anca Muscholl
dc.contributor.editor	David Peleg
dc.contributor.editor	Jaroslav Pokorny
dc.contributor.editor	Bernhard Rumpe
dc.date.accessioned	2017-01-30T12:14:00Z
dc.date.available	2017-01-30T12:14:00Z
dc.date.created	2015-03-03T20:13:40Z
dc.date.issued	2010
dc.identifier.citation	Crochemore, M. and Iliopoulos, C. and Kubica, M. and Rytter, W. and Walen, T. 2010. Efficient algorithms for two extensions of LPF table: the power of suffix arrays, in Jan v Leeuwen, Anca Muscholl, David Peleg, Jaroslav Pokorny and Bernhard Rumpe (ed), 36th Conference on Current Trends in Theory and Practice of Computer Science, Jan 23 2010, pp. 296-307. Spindleruv Mlyn, Czech Republic: Springer.
dc.identifier.uri	http://hdl.handle.net/20.500.11937/19459
dc.identifier.doi	10.1007/978-3-642-11266-9_25
dc.description.abstract	Suffix arrays provide a powerful data structure to solve several questions related to the structure of all the factors of a string. We show how they can be used to compute efficiently two new tables storing different types of previous factors (past segments) of a string. The concept of a longest previous factor is inherent to Ziv-Lempel factorization of strings in text compression, as well as in statistics of repetitions and symmetries. The longest previous reverse factor for a given position i is the longest factor starting at i, such that its reverse copy occurs before, while the longest previous non-overlapping factor is the longest factor v starting at i which has an exact copy occurring before. The previous copies of the factors are required to occur in the prefix ending at position i - 1. We design algorithms computing the table of longest previous reverse factors (LPrF table) and the table of longest previous non-overlapping factors (LPnF table). The latter table is useful to compute repetitions while the former is a useful tool for extracting symmetries. These tables are computed, using two previously computed read-only arrays (SUF and LCP) composing the suffix array, in linear time on any integer alphabet. The tables have not been explicitly considered before, but they have several applications and they are natural extensions of the LPF table which has been studied thoroughly before. Our results improve on the previous ones in several ways. The running time of the computation no longer depends on the size of the alphabet, which drops a log factor. Moreover the newly introduced tables store additional information on the structure of the string, helpful to improve, for example, gapped palindrome detection and text compression using reverse factors. computing their primitive roots. Applications of runs, despite their importance, are underrepresented in existing literature (approximately one page in the paper of Kolpakov & Kucherov, 1999).
In this paper we attempt to fill in this gap. We use Lyndon words and introduce the Lyndon structure of runs as a useful tool when computing powers. In problems related to periods we use some versions of the Manhattan skyline problem.
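The two table definitions given in the abstract can be made concrete with a small brute-force sketch. This is not the linear-time suffix-array algorithm of the paper; it only restates the definitions directly, assuming 0-based positions and that the earlier copy must lie entirely inside the prefix ending at position i - 1. The function names lpnf_naive and lprf_naive are hypothetical and chosen here for illustration.

```python
def lpnf_naive(w: str) -> list[int]:
    """Longest previous non-overlapping factor table, by direct search.

    Entry i is the largest k such that w[i:i+k] also occurs
    entirely inside the prefix w[:i]. Assumption: 0-based positions.
    """
    n = len(w)
    table = []
    for i in range(n):
        prefix = w[:i]
        # The copy must fit in the prefix, so k cannot exceed i.
        k = min(i, n - i)
        while k > 0 and w[i:i + k] not in prefix:
            k -= 1
        table.append(k)
    return table


def lprf_naive(w: str) -> list[int]:
    """Longest previous reverse factor table, by direct search.

    Entry i is the largest k such that the reverse of w[i:i+k]
    occurs entirely inside the prefix w[:i].
    """
    n = len(w)
    table = []
    for i in range(n):
        prefix = w[:i]
        k = min(i, n - i)
        while k > 0 and w[i:i + k][::-1] not in prefix:
            k -= 1
        table.append(k)
    return table


if __name__ == "__main__":
    print(lpnf_naive("abcab"))  # [0, 0, 0, 2, 1]
    print(lprf_naive("abcab"))  # [0, 0, 0, 1, 1]
    print(lpnf_naive("aaaa"))   # [0, 1, 2, 1]
```

On "aaaa" the non-overlapping requirement is visible: an LPF table that allows overlapping copies would give 3 at position 1, whereas the sketch above gives 1. The search takes far more than linear time and is meant only as a reference for checking a faster implementation on small inputs.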
dc.publisher	Springer
dc.relation.uri	http://www.springerlink.com/content/5177t4t8k4m66112
dc.subject	Suffix Array
dc.subject	Longest previous reverse factor
dc.subject	longest previous factor
dc.subject	longest previous non-overlapping factor
dc.subject	runs
dc.subject	palindrome
dc.subject	text compression
dc.title	Efficient algorithms for two extensions of LPF table: the power of suffix arrays
dc.type	Conference Paper
dcterms.source.startPage	296
dcterms.source.endPage	307
dcterms.source.title	Lecture notes in computer science, volume 5091: theory and practice of computer science - SOFSEM 2010
dcterms.source.series	Lecture notes in computer science, volume 5091: theory and practice of computer science - SOFSEM 2010
dcterms.source.isbn	978-364205005-3
dcterms.source.conference	36th Conference on Current Trends in Theory and Practice of Computer Science
dcterms.source.conference-start-date	Jan 23 2010
dcterms.source.conferencelocation	Spindleruv Mlyn, Czech Republic
dcterms.source.place	Heidelberg
curtin.department	Digital Ecosystems and Business Intelligence Institute (DEBII)
curtin.accessStatus	Fulltext not available

Files in this item

Name:: 214507_31448_PUB-CBS-EEB-MC-60 ...
Size:: 263.6Kb
Format:: PDF

This item appears in the following Collection(s)

Curtin Research Publications
