Skip to Main content Skip to Navigation
Conference papers

On Correlation Polynomials and Subword Complexity

Abstract : We consider words with letters from a $q-ary$ alphabet $\mathcal{A}$. The kth subword complexity of a word $w ∈\mathcal{A}^*$ is the number of distinct subwords of length $k$ that appear as contiguous subwords of $w$. We analyze subword complexity from both combinatorial and probabilistic viewpoints. Our first main result is a precise analysis of the expected $kth$ subword complexity of a randomly-chosen word $w ∈\mathcal{A}^n$. Our other main result describes, for $w ∈\mathcal{A}^*$, the degree to which one understands the set of all subwords of $w$, provided that one knows only the set of all subwords of some particular length $k$. Our methods rely upon a precise characterization of overlaps between words of length $k$. We use three kinds of correlation polynomials of words of length $k$: unweighted correlation polynomials; correlation polynomials associated to a Bernoulli source; and generalized multivariate correlation polynomials. We survey previously-known results about such polynomials, and we also present some new results concerning correlation polynomials.
Complete list of metadata

Cited literature [28 references]  Display  Hide  Download

https://hal.inria.fr/hal-01184801
Contributor : Coordination Episciences Iam Connect in order to contact the contributor
Submitted on : Monday, August 17, 2015 - 5:00:30 PM
Last modification on : Friday, June 28, 2019 - 2:48:03 PM
Long-term archiving on: : Wednesday, November 18, 2015 - 12:18:52 PM

File

dmAH0101.pdf
Publisher files allowed on an open archive

Identifiers

Collections

Citation

Irina Gheorghiciuc, Mark Daniel Ward. On Correlation Polynomials and Subword Complexity. 2007 Conference on Analysis of Algorithms, AofA 07, 2007, Juan les Pins, France. pp.1-18, ⟨10.46298/dmtcs.3553⟩. ⟨hal-01184801⟩

Share

Metrics

Record views

69

Files downloads

660