The analysis of how learning evolves with graphical maps relies on placing individuals at positions computed from their answers to certain tests. These techniques are useful for detecting similarities between the knowledge profiles of the subjects and can also be used to assess the acquisition of capabilities after a course. In this paper, we propose to extend some graphical exploratory analysis techniques to the case where the tests contain missing or conflicting answers. We assume that a missing or unknown answer, or a set of conflicting answers to a survey, is aptly represented by an interval or a fuzzy set. Under this representation, each individual in the map is no longer a point but a figure whose shape and size reflect the coherence of the answers and whose position relative to its neighbors reflects the similarities and differences between individuals.
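As a rough illustration of the interval idea (not the method of the paper itself), the sketch below assumes that each test item is scored in [0, 1], that a missing answer becomes the whole interval [0, 1], and that conflicting answers become the interval from the lowest to the highest value given. The helper names `to_interval` and `distance_bounds`, the scoring range, and the sample profiles are all hypothetical. It computes the smallest and largest Euclidean distance that two such interval profiles could have, which is one way to see why an individual becomes a region rather than a point.

```python
import math


def to_interval(answers, lo=0.0, hi=1.0):
    """Turn the raw answers to one item into an interval (low, high).

    Assumption: scores lie in [lo, hi]. No answer -> the full range;
    several conflicting answers -> the range they span.
    """
    if not answers:
        return (lo, hi)
    return (min(answers), max(answers))


def distance_bounds(profile_a, profile_b):
    """Smallest and largest Euclidean distance between two interval profiles."""
    min_sq = 0.0
    max_sq = 0.0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(profile_a, profile_b):
        # Closest the two intervals can be on this item (0 if they overlap).
        gap_min = max(0.0, a_lo - b_hi, b_lo - a_hi)
        # Farthest apart two values drawn from the intervals can be.
        gap_max = max(a_hi - b_lo, b_hi - a_lo)
        min_sq += gap_min ** 2
        max_sq += gap_max ** 2
    return math.sqrt(min_sq), math.sqrt(max_sq)


# Hypothetical subjects: the first skipped item 3, the second gave two
# conflicting answers to item 2.
subject_a = [to_interval(x) for x in ([1.0], [0.0], [])]
subject_b = [to_interval(x) for x in ([1.0], [0.5, 1.0], [1.0])]

d_min, d_max = distance_bounds(subject_a, subject_b)
print(f"distance lies between {d_min:.3f} and {d_max:.3f}")
```

A wide gap between the two bounds signals incoherent or incomplete answers; a fuzzy-set version would repeat the same computation on each membership level.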
This article examines the reliability of statistical models that visualize word distances in computer-assisted text analysis. The study looks at the choice of parameters in COOA, a software package for word co-occurrence analysis. Word co-occurrence analysis enables visualization of text structure by exploring the number of co-occurrences of words. The data visualization provided by a multidimensional scaling (MDS) procedure is susceptible to a particular form of error. At the root of this problem lies the nonlinear relationship between words with significantly different frequencies: words with higher frequencies are placed in the middle of a two-dimensional MDS map. Words with lower frequencies, on the other hand, are forced by the MDS estimator to the edge of the two-dimensional map, and their estimated spatial positions are unstable. These two processes are potentially a major source of error in making inferences. One way to reduce this source of error is to (a) reduce the number of words in a model or (b) increase the number of model dimensions. This article, however, suggests that a detailed investigation of the word structure, together with a thorough analysis of the error sources and their meaningful interpretation, may be a better solution. Václav Čepelák. Includes bibliography.
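To make the input of such a map concrete, the following minimal sketch counts word co-occurrences inside a sliding window. It does not reproduce COOA or its parameters; the function name `cooccurrence_counts`, the `window` parameter, and the toy sentence are assumptions made only for illustration.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=3):
    """Count unordered pairs of distinct words that fall inside a span of
    `window` consecutive tokens (hypothetical parameter, not a COOA setting)."""
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look ahead at the next window - 1 tokens.
        for other in tokens[i + 1 : i + window]:
            if other != word:
                counts[tuple(sorted((word, other)))] += 1
    return counts


tokens = "the map shows the words and the map places the words".split()
counts = cooccurrence_counts(tokens, window=3)

for pair, n in counts.most_common(3):
    print(pair, n)
```

In this toy text the most frequent word takes part in nearly every pair, so raw counts are dominated by frequency. This is the kind of imbalance that, according to the abstract, pulls frequent words toward the center of an MDS map.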
The authors compared bird communities living in five mountain areas in northern Croatia (Risnjak, Papuk, Medvednica, Ivanščica and Cesargrad) using multivariate exploratory techniques on qualitative and quantitative historical data. Similarity matrices were prepared based on Bray-Curtis similarity among samples. Non-metric multidimensional scaling (NMDS) and complete linkage clustering were performed on the qualitative and quantitative similarity matrices, respectively. Principal component analysis (PCA) on quantitative data revealed the bird species that contributed the most to the variability of samples. The first three dimensions explain 75.2% of the variance in samples (53.1%, 13.5% and 8.6%, respectively), and the highest loadings belong to abundant species such as Sylvia atricapilla, Erithacus rubecula, Turdus merula and Phylloscopus collybita. Non-metric multidimensional scaling revealed a clear pattern: significant similarity among communities at low altitudes and, at the same time, insignificant similarity among assemblages at different altitudes (the Papuk community at 600 m a.s.l. is an exception to this rule). The clustering based on the similarity matrix of qualitative data showed clear separation among communities from different mountain areas. This study suggests that monitoring of bird communities in the Croatian mountains must be designed as repeated sampling of quantitative data through time.
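For readers unfamiliar with the Bray-Curtis measure mentioned above, here is a small sketch of how the similarity between two samples can be computed from abundance counts (quantitative data) and from presence/absence (qualitative data). The function name and the abundance figures are invented for illustration and are not taken from the study.

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity: 1 - sum|a_i - b_i| / sum(a_i + b_i)."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    diff = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - diff / total


# Hypothetical abundances of four species at two sampling sites.
site_low = [10, 5, 0, 3]
site_high = [2, 5, 4, 1]

quantitative = bray_curtis_similarity(site_low, site_high)

# Qualitative variant: reduce counts to presence (1) or absence (0).
presence_low = [1 if n > 0 else 0 for n in site_low]
presence_high = [1 if n > 0 else 0 for n in site_high]
qualitative = bray_curtis_similarity(presence_low, presence_high)

print(f"quantitative: {quantitative:.3f}, qualitative: {qualitative:.3f}")
```

The two values differ because the presence/absence version ignores how abundant each species is. A full similarity matrix is obtained by applying the function to every pair of samples.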