Machine Science: Automated Modeling Of Deterministic And Stochastic Dynamical Systems

Other Titles
Abstract

The work presented here advances the technology to analyze experimental data and automatically hypothesize about explanatory models and physical laws that help explain observations. Automated Modeling, sometimes referred to as Symbolic Regression or System Identification, is the process of searching a possibly infinite space of mathematical expressions in order to optimize various objectives - for example, identifying the simplest possible nonlinear equation that captures the observed dynamics of a system. Traditionally, the task of formulating analytical models and theory has remained entirely within the purview of human expertise, and also human limitation. However, the development of Evolutionary Algorithms, and more recently Genetic Programming, has made searching for analytical models automatically a possibility. The work presented here focuses on advancing the algorithms and techniques for Automated Modeling to shrink this "reality gap," and applies these advances to various real and experimental systems for the first time. The specific contributions of this work fall into four categories: search methods and algorithms, model representations and the types of systems that can be analyzed, techniques for interpreting solutions and results, and applications in science and engineering fields. The most important contribution in the search methods is the Fitness and Rank Prediction algorithm, which enables utilizing exceedingly large data sets with low computational effort. This algorithm is based on the idea that, at any given time, only a small number of carefully selected data points are necessary to discriminate among candidate models, allowing large reductions in computational effort. In model representations, the most important contribution is the principle for identifying meaningful invariant quantities amongst the infinite number of trivial invariant expressions. This principle enables searching for physical laws and conservations directly from experimental measurements. In the interpretation of results, the most important contribution is Parameter Mapping technique, which relates an automatically inferred model to a previous model through repeated regressions. Finally, the most important contribution in applications is the analysis of yeast Glycolytic oscillations, which demonstrates and compares several techniques in order to identify a complete nonlinear ordinary differential equation model directly from data.

Journal / Series
Volume & Issue
Description
Sponsorship
Date Issued
2011-01-31
Publisher
Keywords
Machine Science; Artificial Intelligence; Evolutionary Computation
Location
Effective Date
Expiration Date
Sector
Employer
Union
Union Local
NAICS
Number of Workers
Committee Chair
Lipson, Hod
Committee Co-Chair
Committee Member
Ellner, Stephen Paul
Strogatz, Steven H
Degree Discipline
Computational Biology
Degree Name
Ph. D., Computational Biology
Degree Level
Doctor of Philosophy
Related Version
Related DOI
Related To
Related Part
Based on Related Item
Has Other Format(s)
Part of Related Item
Related To
Related Publication(s)
Link(s) to Related Publication(s)
References
Link(s) to Reference(s)
Previously Published As
Government Document
ISBN
ISMN
ISSN
Other Identifiers
Rights
Rights URI
Types
dissertation or thesis
Accessibility Feature
Accessibility Hazard
Accessibility Summary
Link(s) to Catalog Record