More Haste, Less Waste: Lowering the Redundancy in Fully Indexable Dictionaries

Roberto Grossi; Alessio Orlandi; Rajeev Raman; S. Srinivasa Rao

Conference Papers Year : 2009

Roberto Grossi, Dipartimento di Informatica [Pisa]

Alessio Orlandi, Dipartimento di Informatica [Pisa]

Rajeev Raman, Department of Computer Science

S. Srinivasa Rao, Madalgo Center

Abstract

We consider the problem of representing, in compressed form, a bit-vector $S$ of $m$ bits containing $n$ $1$s, while supporting the following operations, where $b \in \{0, 1\}$: $\mathrm{rank}_b(S,i)$ returns the number of occurrences of bit $b$ in the prefix $S[1..i]$; $\mathrm{select}_b(S,i)$ returns the position of the $i$th occurrence of bit $b$ in $S$. Such a data structure is called a fully indexable dictionary (FID) [Raman et al., 2007], and is at least as powerful as predecessor data structures. Our focus is on space-efficient FIDs in the RAM model with word size $\Theta(\lg m)$ and constant time for all operations, so that the time cost is independent of the input size. For a bitstring $S$ of length $m$ containing $n$ ones, the minimum amount of information that must be stored is $B(n,m) = \lceil \log \binom{m}{n} \rceil$ bits. The state of the art in building a FID for $S$ is given in [Patrascu, 2008], which uses $B(n,m) + O(m / (\log m / t)^t) + O(m^{3/4})$ bits to support the operations in $O(t)$ time. Here, we propose a parametric data structure exhibiting a time/space trade-off such that, for any real constants $0 < \delta \leq 1/2$, $0 < \varepsilon \leq 1$, and integer $s > 0$, it uses \[ B(n,m) + O\left(n^{1+\delta} + n \left(\frac{m}{n^s}\right)^{\varepsilon}\right) \] bits and performs all the operations in time $O(s\delta^{-1} + \varepsilon^{-1})$. The improvement is twofold: our redundancy can be lowered parametrically and, fixing $s = O(1)$, we get a constant-time FID whose space is $B(n,m) + O(m^{\varepsilon} / \mathrm{poly}(n))$ bits, for sufficiently large $m$. This is a significant improvement over the previous bounds for the general case.
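
To make the interface in the abstract concrete, the following is a minimal Python sketch of the rank and select semantics and of the bound $B(n,m)$. It is a naive linear-time reference and not the compressed constant-time structure proposed in the paper. The function names (`rank`, `select`, `predecessor`, `info_bound`) are hypothetical, the logarithm is assumed to be base 2, and the `predecessor` helper is only one illustration of how a predecessor query can be phrased with the two operations.

```python
from math import comb


def rank(S, b, i):
    """Number of occurrences of bit b in the prefix S[1..i] (1-based, inclusive)."""
    return sum(1 for bit in S[:i] if bit == b)


def select(S, b, i):
    """1-based position of the i-th occurrence of bit b in S, or None if absent."""
    if i < 1:
        return None
    count = 0
    for pos, bit in enumerate(S, start=1):
        if bit == b:
            count += 1
            if count == i:
                return pos
    return None


def predecessor(S, i):
    """Largest position p <= i with S[p] == 1, expressed via rank and select."""
    return select(S, 1, rank(S, 1, i))


def info_bound(n, m):
    """B(n, m) = ceil(log2(C(m, n))), assuming base-2 logarithms."""
    # For an integer c >= 1, (c - 1).bit_length() equals ceil(log2(c)).
    return (comb(m, n) - 1).bit_length()


S = [0, 1, 1, 0, 1, 0, 0, 1]   # m = 8 bits, n = 4 ones
print(rank(S, 1, 5))           # ones in S[1..5]
print(select(S, 0, 3))         # position of the third 0
print(predecessor(S, 4))       # last 1 at or before position 4
print(info_bound(4, 8))        # bits needed to distinguish all C(8, 4) bitstrings
```

In this toy instance the plain bitstring takes $m = 8$ bits while `info_bound(4, 8)` is 7; the redundancy discussed in the abstract is whatever a FID stores beyond $B(n,m)$ in order to answer the queries quickly.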

Keywords

algorithms, data structures

Domains

Data Structures and Algorithms [cs.DS]

Main file

grossi-hastewaste_new.pdf (247.64 KB)

Origin : Files produced by the author(s)

https://inria.hal.science/inria-00360601

Submitted on : Monday, February 16, 2009-10:55:05 AM

Last modification on : Wednesday, January 4, 2023-3:22:08 PM

Long-term archiving on: Tuesday, June 8, 2010-10:15:30 PM

Dates and versions

inria-00360601 , version 1 (16-02-2009)

Identifiers

HAL Id : inria-00360601 , version 1
ARXIV : 0902.2648

Cite

Roberto Grossi, Alessio Orlandi, Rajeev Raman, S. Srinivasa Rao. More Haste, Less Waste: Lowering the Redundancy in Fully Indexable Dictionaries. 26th International Symposium on Theoretical Aspects of Computer Science STACS 2009, Feb 2009, Freiburg, Germany. pp.517-528. ⟨inria-00360601⟩

Collections

STACS2009
