ABSTRACT

The goal of data mining (DM), also called data analytics or predictive analytics, is to make sense of big data. To quote from a recent NSF (n.d.) posting, the term big data “refers to large, diverse, complex, longitudinal, and/or distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future.” The main challenges in making sense of big data are the number of data points (size), the number of features/attributes describing the data (dimensionality of the space), the number of values each feature takes on, and the heterogeneity of data types (images, signals, text/interviews, health records, etc.) (Cios & Moore, 2002; Cios, Pedrycz, Swiniarski, & Kurgan, 2007; Wu, Zhu, Wu, & Ding, 2013). Thus, DM is not about analyzing small, structured data sets that can be modeled using classical techniques (Hand, 1998; Henly, this volume).