ABSTRACT

Since the 1990s, corpus-based research has been the driving force behind key methodological advances in the neighbouring fields of linguistics, lexicography and translation studies, to name but a few, not least the development of techniques for quantitatively interrogating large computer-held collections of texts. Corpora have become powerful and reliable tools for compiling representative samples of authentic texts from various language varieties and genres, testing hypotheses about the frequency and regularity of linguistic patterns that the researcher’s intuition alone cannot verify, and supporting the generalization of linguistic findings. In effect, corpora expose the limitations of linguistic research based on made-up examples and the analyst’s subjective judgement.