Contributions to Functional Data Analysis







Functional data consist of repeated measurements taken over time for each subject. The data for a subject are treated as values of a random function observed at discrete time points, rather than as a sequence of individual measurements. Functional data are classified as dense or sparse according to whether the time points are frequent and regularly spaced or infrequent and irregularly spaced. The usual longitudinal data are an example of sparse functional data. Functional data are increasingly common in biomedical applications, and their analysis is currently an active area of research in statistics. This dissertation makes two contributions to functional data analysis. The first contribution is the development of a methodology for modeling and analysis of functional data arising in method comparison studies. The observed data in this application consist of repeated measurements of a continuous variable obtained using multiple methods of measurement on a sample of subjects. The data are treated as multivariate functional data that are observed with noise at a common set of discrete time points, which may vary from subject to subject. The proposed methodology uses functional principal components analysis within the framework of a mixed-effects model to represent the observations in terms of a small number of method-specific principal components. Two approaches for estimating the unknowns in the model, both adaptations of general techniques developed for multivariate functional principal components analysis, are presented. Bootstrapping is employed to estimate the bias and covariance matrix of the model parameter estimates. These in turn are used to compute confidence intervals for parameters and functions thereof, such as the measures of similarity and agreement between the measurement methods, that are necessary for data analysis.
The second contribution is the development of a methodology for constructing tolerance bands for two non-Gaussian members of the exponential family: binomial and Poisson. The approach is to first model the data using the framework of generalized functional principal components analysis. Then, a parameter is identified in which the marginal distribution of the response is stochastically monotone. It is shown that the tolerance limits can be readily obtained from the confidence limits of this parameter, which in turn can be computed using standard large-sample theory and bootstrapping. Both methodologies work with dense as well as sparse functional data. Simulation studies are conducted to evaluate their performance and to obtain recommendations for practical applications. Both are illustrated by analyzing real biomedical datasets. Computer programs are provided for their implementation.
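The step from a confidence limit to a tolerance limit can be illustrated in a much simpler setting than the one the dissertation treats. The sketch below assumes independent Poisson counts with a single common mean, with no functional structure and no principal components. Because the Poisson distribution is stochastically increasing in its mean, the quantile of the distribution evaluated at an upper confidence limit for the mean serves as an upper tolerance limit. The function names and the Wald-type confidence limit are illustrative assumptions, not the dissertation's programs or its choice of interval.

```python
import math
from statistics import NormalDist


def poisson_quantile(p: float, lam: float) -> int:
    """Smallest k with P(Y <= k) >= p for Y ~ Poisson(lam).

    Intended for moderate means and p well below 1 (illustrative helper).
    """
    k = 0
    term = math.exp(-lam)   # P(Y = 0)
    total = term            # running value of P(Y <= k)
    while total < p:
        k += 1
        term *= lam / k     # P(Y = k) from P(Y = k - 1)
        total += term
    return k


def upper_tolerance_limit(counts, content=0.90, confidence=0.95):
    """Approximate one-sided upper tolerance limit for i.i.d. Poisson counts.

    Assumption: a large-sample (Wald) upper confidence limit for the mean.
    """
    n = len(counts)
    lam_hat = sum(counts) / n
    z = NormalDist().inv_cdf(confidence)
    lam_upper = lam_hat + z * math.sqrt(lam_hat / n)
    # The Poisson quantile is nondecreasing in the mean, so evaluating it
    # at the upper confidence limit bounds the true quantile from above
    # with approximately the stated confidence.
    return poisson_quantile(content, lam_upper)


counts = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4]   # made-up counts for illustration
limit = upper_tolerance_limit(counts)
```

Here `limit` is a count that at least 90% of the population is expected to fall at or below, with approximately 95% confidence. A bootstrap confidence limit could replace the Wald limit without changing the final step.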



Bootstrap (Statistics), Functional analysis, Multilevel models (Statistics), Poisson distribution, Regression analysis, Stochastic processes