Parallel Lightweight Wavelet Tree, Suffix Array and FM-Index Construction

We present parallel lightweight algorithms to construct wavelet trees, rank and select structures, and suffix arrays in a shared-memory setting. The work and depth of our parallel wavelet tree algorithm matches that of the best existing algorithm while requiring asymptotically less memory. Our experiments show that it is both faster and more memory-efficient than existing parallel algorithms. We also present an experimental evaluation of the parallel construction of rank and select structures, which are used in wavelet trees. Next, we design the first parallel suffix array algorithm based on induced copying. The induced copying requires linear work and polylogarithmic depth for constant alphabets. When combined with a parallel prefix-doubling algorithm, it is more efficient in practice both in terms of running time and memory usage compared to existing parallel implementations. As an application, we combine our algorithms to build the FM-index in parallel.
Authors: Julian Labeit, Julian Shun, Guy Blelloch
Publication Date: March 2016
Conference: Data Compression Conference (DCC)
Download PDF: dcc2016