Conference papers

Interpretable Topic Extraction and Word Embedding Learning Using Row-Stochastic DEDICOM

Abstract: The DEDICOM algorithm provides a uniquely interpretable matrix factorization method for symmetric and asymmetric square matrices. We employ a new row-stochastic variation of DEDICOM on the pointwise mutual information matrices of text corpora to identify latent topic clusters within the vocabulary and simultaneously learn interpretable word embeddings. We introduce a method to efficiently train a constrained DEDICOM algorithm and provide a qualitative evaluation of its topic modeling and word embedding performance.
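
The factorization named in the abstract can be illustrated with a minimal sketch. It assumes the standard DEDICOM form S ≈ A R Aᵀ, where S is a square n×n matrix (here a pointwise mutual information matrix), A is an n×k loading matrix and R is a k×k affinity matrix. This record does not say how the row-stochastic constraint is enforced, so the row-wise softmax below is an assumption, as are the function names `pmi_matrix`, `row_softmax`, `dedicom_reconstruct` and `frobenius_loss` and the choice to set the PMI of unseen word pairs to zero. The sketch covers only the forward model and a squared-error loss, not the training procedure from the paper.

```python
import numpy as np


def pmi_matrix(cooc):
    """Pointwise mutual information from a square co-occurrence count matrix.

    Assumption: pairs that never co-occur (log of zero) are set to 0.
    """
    p_ij = cooc / cooc.sum()
    p_i = p_ij.sum(axis=1, keepdims=True)   # row marginals
    p_j = p_ij.sum(axis=0, keepdims=True)   # column marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ij / (p_i * p_j))
    pmi[~np.isfinite(pmi)] = 0.0
    return pmi


def row_softmax(x):
    """Map each row of x to a probability distribution (rows sum to 1)."""
    z = x - x.max(axis=1, keepdims=True)    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def dedicom_reconstruct(a_raw, r):
    """DEDICOM reconstruction A R A^T with A made row-stochastic (assumed softmax)."""
    a = row_softmax(a_raw)
    return a @ r @ a.T


def frobenius_loss(s, a_raw, r):
    """Squared Frobenius norm of the reconstruction error."""
    return float(np.sum((s - dedicom_reconstruct(a_raw, r)) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_words, n_topics = 8, 3
    counts = rng.integers(1, 20, size=(n_words, n_words)).astype(float)
    counts = counts + counts.T              # symmetric toy co-occurrence counts
    s = pmi_matrix(counts)
    a_raw = rng.normal(size=(n_words, n_topics))
    r = rng.normal(size=(n_topics, n_topics))
    print("loss:", frobenius_loss(s, a_raw, r))
```

Under this reading, each row of the row-stochastic A can be interpreted as a distribution of one word over k latent topics, which is what makes the rows usable as interpretable word embeddings.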

https://hal.inria.fr/hal-03414746
Contributor: Hal Ifip
Submitted on: Thursday, November 4, 2021 - 3:58:20 PM
Last modification on: Friday, November 5, 2021 - 3:57:59 AM
Long-term archiving on: Saturday, February 5, 2022 - 7:10:30 PM

File

Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until 2023-01-01.


Licence


Distributed under a Creative Commons Attribution 4.0 International License


Citation

Lars Hillebrand, David Biesner, Christian Bauckhage, Rafet Sifa. Interpretable Topic Extraction and Word Embedding Learning Using Row-Stochastic DEDICOM. 4th International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), Aug 2020, Dublin, Ireland. pp.401-422, ⟨10.1007/978-3-030-57321-8_22⟩. ⟨hal-03414746⟩
