From Homogeneous To Heterogeneous: Statistical 3-D Signal Reconstruction Of Macromolecular Complexes
The structure and function of biological macromolecular complexes are currently topics of great interest in biology. The primary contributions of this thesis are the mathematical description of a specific problem in this area, the development of algorithms and software to solve it, and the demonstration of the relevance of the solution to biology. The biological problem is to describe the three-dimensional structural heterogeneity of biological macromolecular complexes. The data are single-particle cryo-electron microscopy images of individual instances of the complex and therefore contain information about the heterogeneity of the complex, although this information is usually ignored. Each image is a noisy 2-D projection of the 3-D electron scattering intensity of the particle, modified by the electron optics of the microscope. This thesis develops statistical models, estimators for the model parameters, algorithms for computing the estimates, and high-performance computing implementations of those algorithms, and demonstrates the results on biological problems in which the complex is a virus. The problem is treated as a stochastic-signal-in-noise problem in which the goal is to estimate the statistics of the signal by maximum likelihood. The signal model includes both discrete and continuous heterogeneity: within each class of the discrete heterogeneity, the continuous heterogeneity is described as Gaussian with unknown mean and covariance. The unknown a priori class probabilities and the unknown mean and covariance of each class are estimated by a maximum likelihood estimator, which is computed by a generalized expectation-maximization algorithm implemented in parallel software. The software is demonstrated on experimental images from multiple types of viruses. Previously known biological results are reproduced and novel biological results are determined.
Different complexes have different spatial symmetry groups. Most of the work presented in this thesis concerns complexes that are roughly spherical in shape, especially the subset of such complexes that have icosahedral symmetry. The remainder of the work concerns complexes that have helical symmetry. Evaluating the estimators requires substantial amounts of computation, and various algorithmic and software improvements that reduce this computation are presented. For a fixed amount of computation, such improvements enable higher spatial resolution in the estimated electron scattering intensity, which will enable novel biological discoveries.
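The signal model described in the abstract (discrete classes with a priori probabilities, and a Gaussian with unknown mean and covariance within each class) has the same statistical form as a Gaussian mixture. The sketch below is only a toy illustration of maximum likelihood estimation for such a mixture by plain expectation-maximization on directly observed low-dimensional vectors. It is not the thesis's generalized EM algorithm: it ignores the 2-D projection, the microscope optics, unknown particle orientations, symmetry, and parallelism. The function names `init_means` and `em_gaussian_mixture`, the farthest-point initialization, and the regularization parameter `reg` are assumptions introduced here for illustration.

```python
import numpy as np


def init_means(X, n_classes):
    """Farthest-point heuristic (an assumption of this sketch): start at X[0],
    then repeatedly add the sample farthest from all means chosen so far."""
    means = [X[0]]
    for _ in range(1, n_classes):
        d2 = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(d2)])
    return np.array(means, dtype=float)


def em_gaussian_mixture(X, n_classes, n_iter=50, reg=1e-6):
    """Toy EM for a Gaussian mixture.

    X has shape (n_samples, n_dims). Returns the estimated class
    probabilities, class means, and class covariances.
    """
    n, d = X.shape
    weights = np.full(n_classes, 1.0 / n_classes)  # a priori class probabilities
    means = init_means(X, n_classes)
    var0 = X.var(axis=0).mean()                    # spherical starting covariance
    covs = np.array([var0 * np.eye(d) for _ in range(n_classes)])

    for _ in range(n_iter):
        # E-step: log(weight * Gaussian density) for every sample and class.
        log_r = np.empty((n, n_classes))
        for k in range(n_classes):
            diff = X - means[k]
            maha = np.sum(diff * np.linalg.solve(covs[k], diff.T).T, axis=1)
            _, logdet = np.linalg.slogdet(covs[k])
            log_r[:, k] = np.log(weights[k]) - 0.5 * (
                maha + logdet + d * np.log(2 * np.pi)
            )
        # Normalize in a numerically stable way to get responsibilities.
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted maximum likelihood updates.
        nk = resp.sum(axis=0)
        weights = nk / n
        for k in range(n_classes):
            means[k] = resp[:, k] @ X / nk[k]
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)

    return weights, means, covs
```

Calling `em_gaussian_mixture(X, 2)` on an array of samples returns the three groups of estimates named in the abstract: class probabilities, per-class means, and per-class covariances. The small `reg` term added to each covariance is a common safeguard against singular matrices in toy implementations and is not taken from the thesis.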
Tong, Lang; Tang, Ao; Johnson, John E
Ph. D., Electrical Engineering
Doctor of Philosophy
dissertation or thesis