ABSTRACT

As in other fields of biomedicine, the research and practice of radiation oncology have entered the big data era. As shown in other chapters of this book and in recent publications, the data now being collected are much more challenging to analyze than those collected in the small data era. Such data have significantly higher dimensionality and larger size, a higher level of heterogeneity across samples and data sets, and more complex interconnections (regulations) among variables. Data analysis thus demands new and sophisticated methods that are statistically sound and computationally feasible. In this chapter, we provide a brief and partial survey of recently developed big data methods. Special attention is paid to the analysis of multidimensional studies (which collect multiple types of high-dimensional measurements on the same subjects) and the analysis of multiple independent data sets. The computational aspect, which poses challenges not encountered in the small data era, is also discussed. For each reviewed method, we briefly discuss its rationale, operating characteristics, advantages, and pitfalls.