A search of significant phrases for building topic models in text documents

Details

Title: A search of significant phrases for building topic models in text documents
Authors: Ożdżyński, P.; Zakrzewska, D.
Publication date: 2016
Keywords: topic model; frequent sequences; LDA
Language: English
Content provider: BazTech
Type: Article

The large number of documents in digital libraries calls for efficient methods of exploring the information they contain. Topic modeling is considered one of the most effective of these methods. In contrast to the commonly used approaches based on occurrences of single words, this paper considers building topic models based on phrases. We propose a methodology that creates a set of significant word sequences and thereby limits the search to phrases that contain them. The methodology is evaluated in experiments on real text datasets, and the results are compared with those obtained using the LDA algorithm.
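
The abstract does not specify how significant word sequences are selected. As a rough illustration only, the sketch below assumes that a sequence counts as significant when it appears as a contiguous word n-gram in at least a minimum number of documents. The function name `frequent_sequences`, the parameters `n` and `min_support`, and the sample documents are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def frequent_sequences(docs, n=2, min_support=2):
    """Return contiguous word n-grams found in at least min_support documents.

    This is an assumed, simplified notion of a "significant" sequence,
    not the selection criterion used by the authors.
    """
    support = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        # Use a set so each document counts a given sequence at most once.
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        support.update(grams)
    return {gram: count for gram, count in support.items() if count >= min_support}


# Hypothetical toy corpus, for illustration only.
docs = [
    "topic model for text documents",
    "a topic model based on phrases",
    "frequent sequences in text documents",
]
significant = frequent_sequences(docs, n=2, min_support=2)
```

A set of sequences obtained this way could then be used to restrict the vocabulary passed to a topic model, which is one possible reading of "limiting the search area" in the abstract.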

Record prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.
