## ABSTRACT

One of the problems with many sets of multivariate data is that there are simply too many variables for, say, the graphical techniques described in Chapter 2 to provide an informative initial assessment of the data. Further, having too many variables may cause problems for other statistical techniques that the researcher may want to apply. The possible problem of too many variables is sometimes known as the curse of dimensionality. Clearly, the scatterplots, scatterplot matrices, and other graphics that might be applied to multivariate data for an initial assessment are likely to be more useful when the number of variables, that is, the dimensionality of the data, is relatively small rather than large. This brings us to principal components analysis (PCA), a multivariate technique whose central aim is to reduce the dimensionality of a multivariate data set while retaining as much as possible of the variation present in it. This aim is achieved by transforming to a new set of variables, the principal components, that are uncorrelated and ordered so that the first few of them account for most of the variation in all the original variables. In the best of all possible worlds, the result of a PCA would be a small number of new variables that can serve as surrogates for the originally large number of variables and, consequently, provide a simpler basis for, say, graphing or summarizing the data and also, perhaps, for undertaking further multivariate analyses of the data.
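The transformation described above can be illustrated with a minimal sketch in Python with NumPy. It is not taken from the chapter. The data are simulated purely for illustration, the function name `pca` and all variable names are hypothetical, and the components are extracted from the sample covariance matrix (an assumption; in practice the correlation matrix is often preferred when variables are on different scales).

```python
import numpy as np


def pca(X):
    """Principal components of a data matrix X (rows = observations).

    Illustrative helper, not a library API. Returns the component
    variances (largest first), the loadings, and the component scores.
    """
    Xc = X - X.mean(axis=0)                # centre each variable
    S = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh: ascending eigenvalues of symmetric S
    order = np.argsort(eigvals)[::-1]      # reorder so the largest variance comes first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    scores = Xc @ eigvecs                  # the new variables (principal components)
    return eigvals, eigvecs, scores


# Simulated data (an assumption for this sketch): 200 observations on
# 5 variables that are driven by only 2 underlying factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

variances, loadings, scores = pca(X)

# Proportion of total variation accounted for by the first k components.
explained = variances / variances.sum()
print(np.round(np.cumsum(explained), 3))

# The components are uncorrelated: their covariance matrix is diagonal.
print(np.round(np.cov(scores, rowvar=False), 6))
```

Because the five simulated variables depend on only two factors, the first two components should carry nearly all of the variation, which is the situation in which a few components can stand in for the full set of variables.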