ABSTRACT

The routine operations of modern cancer diagnosis and oncologic treatment generate a vast stream of data. The data is usually stored electronically, but it tends to be scattered across different disciplines (e.g., radiation oncology, radiology, medical oncology, surgery), different data storage platforms (e.g., electronic medical record [EMR], Picture Archiving and Communication System [PACS]), and a wide variety of formats (e.g., Digital Imaging and Communications in Medicine [DICOM], ASCII, PDF) (Deng 2014). In addition, data in cancer care is characterized by large volume and complex dependencies between data elements, which create growing difficulties for conventional methods of data handling. By handling, we mean collection, storage, update, and exchange. Although the variety and volume of big data continue to grow exponentially within the field of oncology (Chen et al. 2014), it has not been easy to exploit this rich vein of data to improve patient safety and health outcomes (McNutt et al. 2016).