ABSTRACT

In this chapter, we explore methods for functions that are sparsely observed. Such data arise often in longitudinal studies, where researchers can observe each subject at only a relatively small number of time points, and these time points can differ from subject to subject. For example, patients may arrive for diagnostic examinations at only a handful of irregularly spaced time points. For this reason, we refer to the methods in this chapter as Sparse Functional Data Analysis, or S-FDA. An entire textbook could be devoted to this topic, so in one chapter we can only outline the key methodologies and how they differ from methods for densely observed functions. In S-FDA, smoothing is not applied to individual sparse trajectories; imputed smooth trajectories can be obtained only after information from the whole sample has been suitably combined. A distinguishing feature relative to the approaches discussed in previous chapters is thus that one does not usually embed each unit directly into a function space. Doing so could produce very unreliable curve estimates and potentially introduce a substantial amount of bias. Instead, most S-FDA methods rely heavily on pooling across subjects and on nonparametric smoothing, also called scatterplot smoothing or nonparametric regression.
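
To make the idea of pooling across subjects concrete, the following is a minimal sketch, not a method taken from this chapter. It simulates sparse trajectories, pools all (time, value) pairs into a single scatterplot, and estimates the mean function with a local linear smoother. The function name local_linear_smoother, the Gaussian kernel, the bandwidth of 0.08, and the simulated sine-shaped mean are all illustrative assumptions.

```python
import numpy as np


def local_linear_smoother(t_obs, y_obs, t_grid, bandwidth):
    """Local linear regression with a Gaussian kernel (hypothetical helper).

    t_obs, y_obs: pooled observation times and values from all subjects.
    t_grid: points at which the smooth is evaluated.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_grid = np.asarray(t_grid, dtype=float)
    fitted = np.empty(len(t_grid))
    for k, t0 in enumerate(t_grid):
        d = t_obs - t0
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # kernel weights
        # Weighted least squares for y ~ b0 + b1 * (t - t0);
        # the intercept b0 is the estimate at t0.
        X = np.column_stack([np.ones_like(d), d])
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ y_obs)
        fitted[k] = beta[0]
    return fitted


# Simulated sparse design (an assumption for illustration only):
# each subject contributes 2 to 5 noisy observations at random times.
rng = np.random.default_rng(0)
n_subjects = 200
times, values = [], []
for _ in range(n_subjects):
    m = rng.integers(2, 6)                  # number of visits, 2..5
    t = np.sort(rng.uniform(0.0, 1.0, size=m))
    shift = rng.normal(0.0, 0.5)            # subject-level random effect
    y = np.sin(2 * np.pi * t) + shift + rng.normal(0.0, 0.2, size=m)
    times.append(t)
    values.append(y)

# Pool across subjects: no single trajectory is smoothed on its own.
t_pooled = np.concatenate(times)
y_pooled = np.concatenate(values)

t_grid = np.linspace(0.05, 0.95, 19)
mu_hat = local_linear_smoother(t_pooled, y_pooled, t_grid, bandwidth=0.08)
max_abs_error = np.max(np.abs(mu_hat - np.sin(2 * np.pi * t_grid)))
```

With only two to five points per subject, a smoother applied to a single trajectory would be very unstable, whereas the pooled scatterplot contains several hundred points. In practice the bandwidth would be chosen by a data-driven rule such as cross-validation rather than fixed in advance.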