ABSTRACT

The advantage of this typology is mainly conceptual; the scheme makes it clear to which level the measurements properly belong, and how related variables can be created by aggregation or disaggregation. Historically, the problem of analyzing data from individuals nested within groups was "solved" by moving all variables, by aggregation or disaggregation, to one single level, followed by some standard (single-level) analysis method. A more sophisticated approach was the "slopes as outcomes" approach, in which a separate analysis was carried out in each group and the estimates for all groups were collected in a group-level data matrix for further analysis. A nice introduction to these historical analysis methods is given by Boyd and Iverson (1979). All these methods are flawed, because the analysis either ignores the different levels or treats them inadequately. Statistical criticism of these methods appeared soon after their adoption, for example by Tate and Wongbundhit (1983) and de Leeuw and Kreft (1986).

Better statistical methods were already available: Hartley and Rao (1967) discuss estimation methods for the mixed model, which is essentially a multilevel model, and Mason, Wong, and Entwisle (1984) describe such a model for multilevel data, including software for its estimation. A nice summary of the state of the art around 1980 is given by van den Eeden and Hüttner (1982). The difference between that state of the art and the present (2010) situation is clear from its contents: there is much discussion of (dis)aggregation, of the "proper" level for the analysis, and of multiple-regression tricks such as slopes as outcomes and other two-step procedures. There is no mention of statistical models as such, of statistical dependency, random coefficients, or estimation methods. In short, what is missing is a principled statistical modeling approach.