This is the first of two chapters that outline the evidence base for the remainder of this book. It is important both because, as is standard, it provides readers with a basis for judging the security of the findings, and because it explains why those findings are original and different to what is often merely assumed to be true. This chapter presents a brief definition of terms relevant to educational outcomes and social disadvantage, and a summary of some relevant existing datasets. Existing and official statistics are much more complete, longer-term and larger in scale than anything that even a very full programme of primary research can provide. They are vital to the task of describing patterns of attainment and participation in education. Of course, they also have well-known limitations. They were usually collected for a different purpose to that for which they are used here, and they do not tessellate with or complement each other well, even when set in the same time period and geographic region. One example for the UK is that individual student records for schools use poverty as an indicator of parental background, whereas individual student data for higher education use parental occupation. Both measures are useful but they are not equivalent and neither can be converted to the other. Another important issue for this book is that the datasets for different countries do not match. This book makes use of primary international datasets and reviews of evidence worldwide (Chapter 3), and makes reference to results from international studies such as the Trends in International Mathematics and Science Study (TIMSS), the Programme for International Student Assessment (PISA) and the Progress in International Reading Literacy Study (PIRLS), and to teacher effectiveness work in the US. The most comprehensive datasets used in the book are from the UK or England.
However, there are further crucial limitations to even these high-quality and world-leading datasets that are not usually acknowledged by other commentators. These problems are covered in some detail in this chapter, and they begin to show why the results in this book will be a challenge to some established beliefs, in areas such as widening participation and school and teacher performance. The chapter therefore provides a caution about how far existing data can be stretched without providing misleading results.