Lempel-Ziv factorization using less time & space

Chen, G.; Puglisi, Simon; Smyth, B.

doi:10.1007/s11786-007-0024-4

Access Status

Fulltext not available

Authors

Chen, G.

Puglisi, Simon

Smyth, B.

Date

2008

Type

Journal Article

Metadata

Show full item record

Citation

Chen, Gang and Puglisi, Simon and Smyth, Bill. 2008. Lempel-Ziv factorization using less time & space. Mathematics in Computer Science 1 (4): pp. 605-623.

Source Title

Mathematics in Computer Science

DOI

10.1007/s11786-007-0024-4

ISSN

16618270

Faculty

Curtin Business School

The Centre for Extended Enterprises and Business Intelligence (CEEBI)

School

Centre for Extended Enterprises and Business Intelligence

URI

http://hdl.handle.net/20.500.11937/45235

Collection

Curtin Research Publications

Abstract

For 30 years the Lempel-Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on T(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST x of x. Recently Abouelhoda et al. proposed an efficient Lempel-Ziv factorization algorithm based on an 'enhanced' suffix array - that is, a suffix array SA x together with supporting data structures, principally an 'interval tree'. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true T(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether.