
PPLSA: Parallel Probabilistic Latent Semantic Analysis Based on MapReduce

Abstract: PLSA (Probabilistic Latent Semantic Analysis) is a popular topic modeling technique for exploring document collections. As large datasets become increasingly prevalent, there is a need to improve the scalability of computation in PLSA. In this paper, we propose a parallel PLSA algorithm, called PPLSA, that accommodates large corpora in the MapReduce framework. Our solution distributes computation efficiently and is relatively simple to implement.
Document type: Conference papers

Cited literature [9 references]

https://hal.inria.fr/hal-01524958
Contributor: Hal Ifip
Submitted on: Friday, May 19, 2017 - 10:43:18 AM
Last modification on: Friday, September 11, 2020 - 2:30:03 PM

File

978-3-642-32891-6_8_Chapter.pd...
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Ning Li, Fuzhen Zhuang, Qing He, Zhongzhi Shi. PPLSA: Parallel Probabilistic Latent Semantic Analysis Based on MapReduce. 7th International Conference on Intelligent Information Processing (IIP), Oct 2012, Guilin, China. pp.40-49, ⟨10.1007/978-3-642-32891-6_8⟩. ⟨hal-01524958⟩
