Methods of clustering text documents used in employee recruitment process


Title: Methods of clustering text documents used in employee recruitment process
Metody grupowania dokumentów tekstowych wykorzystywanych w procesie rekrutacji pracowników
Authors: Antkiewicz, Patryk
Keywords: clustering, text, document, recruitment, algorithm
Language: Polish
Content provider: Repozytorium Uniwersytetu Jagiellońskiego
: Other


The aim of this study is to present popular document clustering techniques and to analyze the efficiency of selected methods with regard to a specific type of data: text documents used in the employee recruitment process (CVs). An attempt will be made to assess which of these algorithms can help organize this kind of data effectively in large corporations and on websites that support the employee recruitment process. To achieve this goal, the relevant literature has been consulted and several commonly used algorithms have been implemented. A series of experiments has then been performed on previously prepared datasets. Because of the large number of existing approaches to the clustering problem, the research focuses mainly on the most common method of text document representation, the vector space model (VSM).

The aim of this thesis is to present popular techniques for clustering text documents and to analyze the effectiveness of several selected methods with respect to a specific kind of data: documents used in the employee recruitment process (CVs). An attempt will be made to assess which of the algorithms can find practical application and make it easier to organize data of this type effectively in large corporations and on websites that support the recruitment process. To achieve this goal, the literature on clustering was reviewed, an application implementing several commonly used algorithms was built, and a series of experiments was carried out on previously prepared test sets. Because of the multitude of existing approaches to the problem, the research focuses primarily on the most commonly used one, the vector model of document representation.
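
To make the vector space model mentioned in the abstract concrete, the following is a minimal sketch, not the thesis's own implementation. It assumes TF-IDF weighting, cosine similarity, and a simple single-pass "leader" grouping, none of which the record confirms were among the methods studied. The toy CV snippets, the function names, and the 0.1 threshold are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized documents to sparse TF-IDF vectors (dict: term -> weight)."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def leader_clustering(vectors, threshold):
    """Single-pass grouping: join the first cluster whose leader is similar
    enough, otherwise start a new cluster. Illustrative only."""
    leaders = []  # index of the first document in each cluster
    labels = []
    for i, vec in enumerate(vectors):
        for cluster_id, leader in enumerate(leaders):
            if cosine(vec, vectors[leader]) >= threshold:
                labels.append(cluster_id)
                break
        else:
            leaders.append(i)
            labels.append(len(leaders) - 1)
    return labels


# Hypothetical, already tokenized CV snippets (assumed data, not from the thesis).
docs = [
    "java developer spring backend".split(),
    "senior java backend engineer".split(),
    "nurse hospital patient care".split(),
    "hospital nurse emergency care".split(),
]
vectors = tfidf_vectors(docs)
labels = leader_clustering(vectors, threshold=0.1)  # [0, 0, 1, 1]
```

In this toy run the two software CVs share weighted terms and end up in one group, while the two nursing CVs form another. Real CV data would need tokenization, stop-word removal, and a more robust clustering algorithm than this single pass.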
