Tensor (Multidimensional Array) Decomposition, Regression and Software for Statistics and Machine Learning
This thesis illustrates connections between statistical models for tensors, introduces a novel linear model for tensors with 3 modes, and implements tensor software in the form of an R package. Tensors, or multidimensional arrays, are a natural generalization of the vectors and matrices that are ubiquitous in statistical modeling. However, while matrix algebra has been well studied and plays a crucial role in the interaction between data and the parameters of any given model, the algebra of higher-order arrays has been relatively overlooked in data analysis and statistical theory. The emergence of multilinear datasets, in which observations are vector-variate, matrix-variate, or even tensor-variate, only serves to emphasize the relative lack of statistical understanding of tensor data structures. In the first half of the thesis, we highlight classic tensor algebraic results and models used in image analysis, chemometrics, and psychometrics, and connect them to recent statistical models. The second half of the thesis features a linear model based on a recently introduced tensor multiplication. For this model, we prove some of the classic properties that we would expect from a 3-tensor generalization of matrix ordinary least squares. We also apply our model to a functional dataset to demonstrate one possible use. We conclude the thesis with an exposition of the software developed to facilitate tensor modeling and manipulation in R. This software implements many of the classic tensor decomposition models as well as our own linear model.
tensor; multilinear; multidimensional; linear regression; tensor least squares; inference; machine learning; prediction
Wells, Martin Timothy
Booth, James; Bien, Jacob
Ph. D., Statistics
Doctor of Philosophy
dissertation or thesis