In this chapter we discuss the results of two event-related potential (henceforth ERP) experiments which investigate the psychological validity of corpus-derived collocations. ERPs are momentary changes in the brain's electrical activity that occur in response to a particular stimulus. There is a large body of literature showing that distinct ERP responses are elicited by reading sentences containing grammatical errors (e.g. Osterhout & Holcomb, 1992) and semantic errors (e.g. Kutas & Hillyard, 1980). There are also studies which investigate the ERP response to reading collocational errors in the absence of grammatical or semantic errors (Molinaro & Carreiras, 2010; Molinaro, Barraza, & Carreiras, 2013; Siyanova-Chanturia, Conklin, Caffarra, Kaan, & van Heuven, 2017). However, these studies tend to focus on idioms and other highly fixed multi-word expressions. In this chapter, we adopt a more fluid conceptualization of collocation by focusing on adjective-noun bigrams which have a high transition probability. The transition probability of a bigram X-then-Y is calculated by dividing the number of times that bigram occurs in the written BNC1994 by the total number of times word X occurs in the written BNC1994.
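
The transition probability calculation described above can be illustrated with a minimal Python sketch. The function name `transition_probability` and the toy token list are hypothetical and are used only for illustration; the sketch assumes a corpus that has already been tokenized into a flat list of words, and it does not use BNC1994 data or reproduce the extraction procedure used in the experiments.

```python
from collections import Counter


def transition_probability(tokens, x, y):
    """Return count(x immediately followed by y) / count(x).

    `tokens` is assumed to be a flat list of word tokens.
    Returns 0.0 if x never occurs in the token list.
    """
    unigram_counts = Counter(tokens)
    # Pair each token with its successor to obtain adjacent bigrams
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    if unigram_counts[x] == 0:
        return 0.0
    return bigram_counts[(x, y)] / unigram_counts[x]


# Toy token list invented for illustration (not corpus data)
toy_tokens = "a vast majority of a vast area and a vast majority".split()

# "vast" occurs 3 times and "vast majority" occurs 2 times
tp = transition_probability(toy_tokens, "vast", "majority")
```

In this toy example the first word occurs three times and is followed by the second word on two of those occasions, so the transition probability is two thirds.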

In Experiment 1, we show that reading a non-collocational bigram elicits a brain response known in ERP research as the N400 effect, where a negative voltage deflection occurs approximately 400 ms after the onset of the stimulus. We replicate this finding using a different set of experimental bigrams in Experiment 2 Part 1. Then, in Experiment 2 Part 2, we show that there is a strong correlation between the transition probability of a collocational bigram and the amplitude of the N400. In Experiment 2 Part 2 we also correlate the amplitude of the N400 with other association measures (namely mutual information, MI3, z-score, t-score, log-likelihood, Dice coefficient, and raw frequency), showing that the association measures with the most psychological validity are hybrid measures, that is, those which combine significance with effect size, or which combine either significance or effect size with frequency.