As Firth (1957:11) famously put it, “You shall know a word by the company it keeps”: we need to look at how words are used before we attempt to describe them. A single human, or even a group of expert researchers, cannot be expected to know every use of every word or, more generally, of every linguistic phenomenon produced by all speakers of a language. We thus need to collect samples of language in use and compile them computationally in such a way that they can be studied by humans. Such empiricism is moreover argued to be a suitable practice for widening the lexicographic (and linguistic) horizon; as Douglas Biber and Randi Reppen (2015:2) state, “corpus analyses have documented the existence of linguistic constructs that are not recognized by current linguistic theories”.