Most psychologists would agree that a well-designed employment test should yield evidence of criterion-related validity when tested against a well-measured criterion. If “validity generalization” (VG) were limited to this inference, then there would be no reason for this chapter. Indeed, the authors of this chapter subscribe to this inference, but VG is not limited to it. Instead, VG inferences are often extended to suggest that the magnitudes of test validities are invariant across situations; that is, situations do not influence the magnitude of criterion-related validity coefficients. This line of thinking is aptly captured in quotes such as the following:

The evidence from these two studies appears to be the last nail required for the coffin of the situational specificity hypothesis.

(Schmidt, Hunter, Pearlman, & Rothstein-Hirsch, 1985, p. 758)

The cumulative pattern of findings … provides strong support for the hypothesis that there is essentially no situational variance in true validities for classic ability constructs used for selection on similar jobs.

(Schmidt et al., 1993, p. 11)

these studies found that, on average, all variance across settings (i.e., companies) was accounted for by artifacts…. All these pieces of interlocking evidence point in the same direction: toward the conclusion that, for employment tests of cognitive abilities, the situational specificity hypothesis is false.

(Hunter & Schmidt, 2004, pp. 404–405)

Beginning in 1977, Schmidt and Hunter began publishing empirical evidence discrediting the situational specificity hypothesis. Specifically, they demonstrated that much of the variability in validity coefficients across studies was due to random sampling error.

(McDaniel, Kepes, & Banks, 2011, p. 497)

There is little question that (psychometrically well-developed) tests of knowledge, skills, abilities (i.e., KSAs), and personality traits generally predict (psychometrically well-developed) measures of organizationally relevant criteria. In this sense, the criterion-related validity evidence for these tests can be said to generalize. Whether the validity for a given type of predictor (e.g., critical intellectual skills) against a given class of criterion (e.g., job performance) is generally invariant across situations (i.e., cross-situationally consistent) is another issue. The cross-situational consistency hypothesis (i.e., VG) has endured a long history of theoretical and empirical debate, the roots of which can be traced, in part, to the person-situation debate (cf. Buss, 1979; Cronbach & Snow, 1977; Epstein, 1979; Hogan, 2009; Kendrick & Funder, 1988; Mischel & Peake, 1982). The emergence of meta-analysis as a popular method for testing the consistency of predictive validities across a set of separate studies (e.g., situations) accelerated and transformed the debate into one of a more quantitative and methodological nature.
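
To make the quantitative form of the debate concrete, the following is a minimal sketch of a "bare-bones" comparison between the observed variance of validity coefficients across studies and the variance expected from sampling error alone. It is an illustration under stated assumptions, not the procedure of any particular author cited above: the function name, the input values, and the use of the common approximation (1 - r̄²)² / (N̄ - 1) for sampling-error variance are choices made here, and corrections for other artifacts (e.g., unreliability, range restriction) are omitted.

```python
def bare_bones_summary(rs, ns):
    """Compare observed variance in validity coefficients with the
    variance expected from sampling error alone.

    rs: observed validity coefficients (correlations), one per study
    ns: the corresponding study sample sizes
    Assumption: sampling-error variance is approximated by
    (1 - mean_r**2)**2 / (mean_n - 1); no other artifacts are corrected.
    """
    total_n = sum(ns)
    # Sample-size-weighted mean validity
    mean_r = sum(n * r for r, n in zip(rs, ns)) / total_n
    # Sample-size-weighted observed variance of the coefficients
    var_obs = sum(n * (r - mean_r) ** 2 for r, n in zip(rs, ns)) / total_n
    # Expected sampling-error variance, based on the average sample size
    mean_n = total_n / len(ns)
    var_err = (1 - mean_r ** 2) ** 2 / (mean_n - 1)
    # Variance left over after removing sampling error (floored at zero)
    var_residual = max(var_obs - var_err, 0.0)
    return {
        "mean_r": mean_r,
        "var_obs": var_obs,
        "var_err": var_err,
        "var_residual": var_residual,
    }


# Hypothetical data for illustration only (not taken from any study)
summary = bare_bones_summary(rs=[0.2, 0.3, 0.4], ns=[100, 100, 100])
print(summary)
```

In this sketch, a residual variance near zero is the pattern that proponents of cross-situational consistency point to, whereas a sizable residual leaves room for situational moderators; how that residual should be interpreted is precisely what is at issue in the debate.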