Title:
Spectral clustering and principal component analysis as tools for variable transformation in symbolic interval-valued data ensembles
The selection, weighting and transformation of variables are essential phases of the modelling process. Two approaches can be applied to improve a model's accuracy: variable selection and variable transformation. For transformation in symbolic data analysis, two methods can be adopted: principal component analysis (PCA) and spectral clustering. In both cases, we start with a set of symbolic variables and, after transformation, obtain either classical variables (single numeric values) or symbolic variables that can be used in various models. The paper presents and compares PCA and spectral clustering for symbolic interval-valued data as tools for variable transformation. Artificial data with a known cluster structure were used to compare the two methods in both single and ensemble clustering approaches. The results suggest that spectral clustering achieves better results for both single and ensemble models.
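To make the two transformations concrete, the following is a minimal sketch and not the authors' implementation. It assumes each interval-valued variable is stored as a pair of lower and upper bounds. For PCA it uses the midpoint ("centers") variant, which is only one of several PCA methods proposed for interval data. For spectral clustering it uses a Gaussian affinity on the stacked bounds followed by a normalized-Laplacian embedding in the style of Ng, Jordan and Weiss. The function names `centers_pca` and `spectral_embedding`, the parameter `sigma`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def centers_pca(lower, upper, n_components=2):
    """Midpoint ("centers") PCA for interval data; an assumed variant."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    centers = (lower + upper) / 2.0
    mean = centers.mean(axis=0)
    # Principal axes come from the SVD of the centered midpoints.
    _, _, vt = np.linalg.svd(centers - mean, full_matrices=False)
    axes = vt[:n_components].T  # shape (n_variables, n_components)
    # Classical output: one numeric score per object and component.
    scores = (centers - mean) @ axes
    # Symbolic output: project the whole interval box onto each axis.
    # A positive loading attains its minimum at the lower bound,
    # a negative loading at the upper bound.
    pos = np.clip(axes, 0.0, None)
    neg = np.clip(axes, None, 0.0)
    score_lower = (lower - mean) @ pos + (upper - mean) @ neg
    score_upper = (upper - mean) @ pos + (lower - mean) @ neg
    return scores, score_lower, score_upper


def spectral_embedding(lower, upper, n_components=2, sigma=1.0):
    """Spectral embedding of interval data; yields classical variables."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Assumption: represent each object by its stacked bounds and use
    # squared Euclidean distance between these vectors.
    z = np.hstack([lower, upper])
    sq_dist = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    # Symmetrically normalized affinity D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalized = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    _, eigvecs = np.linalg.eigh(normalized)
    top = eigvecs[:, -n_components:]
    # Scale each row to unit length before any downstream clustering.
    return top / np.linalg.norm(top, axis=1, keepdims=True)


# Toy interval data: 6 objects described by 3 interval-valued variables.
rng = np.random.default_rng(0)
lower = rng.normal(size=(6, 3))
upper = lower + rng.uniform(0.1, 1.0, size=(6, 3))

scores, score_lower, score_upper = centers_pca(lower, upper)
embedding = spectral_embedding(lower, upper)
```

Either output can then be passed to a standard clustering method such as k-means. The interval scores `score_lower` and `score_upper` keep the symbolic form, while `scores` and `embedding` are classical single-valued variables.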