Spectral learning with proper probabilities for finite state automation

Abstract : Probabilistic Finite Automata (PFA), Probabilistic Finite State Transducers (PFST) and Hidden Markov Models (HMM) are widely used in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) systems and Part Of Speech (POS) tagging for language modeling. Traditionally, unsupervised learning of these latent variable models is done by Expectation-Maximization (EM)-like algorithms, such as the Baum-Welch algorithm. In a recent alternative line of work, learning algorithms based on spectral properties of low-order moment matrices or tensors were proposed. In comparison to EM, they are orders of magnitude faster and come with theoretical convergence guarantees. However, the returned models are not guaranteed to compute proper distributions: they often return negative values that do not sum to one, which limits their applicability and prevents them from serving as an initialization for EM-like algorithms. In this paper, we propose a new spectral algorithm able to learn a large range of models constrained to return proper distributions. We assess its performance on synthetic problems from the PAutomaC challenge and real datasets extracted from Wikipedia. Experiments show that it outperforms previous spectral approaches as well as the Baum-Welch algorithm with random restarts, and that it also serves as an efficient initialization step for EM-like algorithms.
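To make the notion of a "proper distribution" concrete, here is a minimal Python sketch of how a finite automaton with weighted transitions assigns a value to a string, and how a model with unconstrained parameters can assign a negative value. It is not the algorithm proposed in the paper. All names (`string_weight`, `is_proper_pfa`, `total_mass`) and all numeric parameters are hypothetical and chosen only for illustration. In particular, the "improper" model is hand-written to mimic the kind of output the abstract describes; it is not produced by any actual spectral method.

```python
from itertools import product


def string_weight(alpha0, mats, alpha_inf, word):
    """Value assigned to `word`: alpha0^T * A_{x1} * ... * A_{xn} * alpha_inf."""
    v = list(alpha0)
    for symbol in word:
        A = mats[symbol]
        # Row vector times matrix.
        v = [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]
    return sum(vi * fi for vi, fi in zip(v, alpha_inf))


def is_proper_pfa(alpha0, mats, alpha_inf, tol=1e-9):
    """Check the usual PFA constraints: non-negative parameters, initial
    weights summing to one, and, for each state, outgoing transition
    weights plus stopping weight summing to one."""
    params = list(alpha0) + list(alpha_inf)
    for A in mats.values():
        for row in A:
            params.extend(row)
    if any(p < -tol for p in params):
        return False
    if abs(sum(alpha0) - 1.0) > tol:
        return False
    for i in range(len(alpha_inf)):
        outgoing = alpha_inf[i] + sum(sum(A[i]) for A in mats.values())
        if abs(outgoing - 1.0) > tol:
            return False
    return True


def total_mass(alpha0, mats, alpha_inf, max_len):
    """Sum of the values of all strings of length <= max_len (brute force)."""
    alphabet = sorted(mats)
    return sum(
        string_weight(alpha0, mats, alpha_inf, w)
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )


# Hypothetical 2-state model over {a, b} that satisfies the constraints.
ALPHA0 = [1.0, 0.0]
PROPER = {
    "a": [[0.3, 0.2], [0.0, 0.4]],
    "b": [[0.1, 0.2], [0.3, 0.1]],
}
FINAL_PROPER = [0.2, 0.2]

# Hypothetical model with an unconstrained (negative) transition weight.
IMPROPER = {
    "a": [[0.5, -0.3], [0.2, 0.4]],
    "b": [[0.1, 0.2], [0.3, 0.1]],
}
FINAL_IMPROPER = [0.2, 0.9]
```

With these illustrative numbers, the constrained model passes `is_proper_pfa` and its total mass approaches one as `max_len` grows, whereas the unconstrained model assigns the string "a" the value 0.5 × 0.2 + (−0.3) × 0.9 = −0.17, which cannot be interpreted as a probability.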

https://hal.inria.fr/hal-01225810
Contributor : Olivier Pietquin
Submitted on : Monday, November 9, 2015 - 3:00:01 PM
Last modification on : Thursday, April 4, 2019 - 10:18:05 AM
Long-term archiving on : Wednesday, February 10, 2016 - 10:09:20 AM

File

ASRU_2015_HGCEOP.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01225810, version 1

Citation

Hadrien Glaude, Cyrille Enderli, Olivier Pietquin. Spectral learning with proper probabilities for finite state automation. ASRU 2015 - Automatic Speech Recognition and Understanding Workshop, Dec 2015, Scottsdale, United States. ⟨hal-01225810⟩
