eCommons

STRUCTURED AUTOMATIC DIFFERENTIATION

Abstract

Differentiation is one of the fundamental problems in numerical mathematics. The solution of many optimization problems and other applications requires knowledge of the gradient, the Jacobian matrix, or the Hessian matrix of a given function. Many large-scale optimization applications (e.g., inverse problems) are very complex in nature. It becomes impractical to treat the function evaluation of such problems as a "black-box" function, since the computation is structured in some manner, going through a set of defined steps, i.e., problem structure. It pays to expose the structure in the computation so that the derivatives can be computed efficiently, making the solution procedure practical. Automatic differentiation (AD) can compute fast and accurate derivatives of any degree by propagating Taylor series coefficients using the chain rule. AD incurs no truncation error and would yield exact results if the calculations were done in real arithmetic; in other words, the derivatives obtained are accurate to machine precision. This thesis is concerned with the efficient application of AD to large (and complex) optimization problems. The major theme is the exploitation of the structure of the user's problem. We present methodologies which allow AD to exploit problem structure. An important idea is the exploitation of sparsity in Jacobian matrices: we present a scheme which combines the forward and reverse modes of AD. Problem structure can be viewed in many different ways; one way is to look at the granularity of the operations involved. For example, differentiation carried out at the level of matrix-vector operations can lead to great savings in time as well as space requirements.
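
The propagation of Taylor coefficients by the chain rule can be illustrated with a minimal forward-mode sketch using first-order dual numbers; this is an illustrative example only, not the implementation described in the thesis.

```python
import math

class Dual:
    """A dual number (value, derivative): each arithmetic operation
    propagates the first-order Taylor coefficient by the chain rule."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def sin(x):
    # chain rule for an elementary function: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# f(x) = x * sin(x); seeding dx/dx = 1 yields f'(x) = sin(x) + x cos(x),
# with no truncation error, unlike a finite-difference approximation.
x = Dual(2.0, 1.0)
y = x * sin(x)
```

Because only exact chain-rule identities are applied, the resulting derivative `y.dot` agrees with the analytic derivative to machine precision, as the abstract notes.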
Identifying the kind of computation is another way to view structure; e.g., partially separable or composite functions have structure that can be exploited for performance gains. In this thesis we develop a general structure framework which can be viewed hierarchically and allows for structure exploitation at various levels. For example, for time-integration schemes employing stencils it is possible to exploit structure at both the stencil level and the timestep level. We also present some advanced structure-exploitation ideas, e.g., parallelism in structured computations and the use of structure in implicit computations. The use of AD as a derivative-computing engine naturally automates all the methodologies presented in this work; we present ways to make the design of numerical optimization software very transparent, and the presentation of problems by the user as easy as possible.
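
The gain from recognizing a partially separable function can be sketched as follows: each element function touches only a small index set, so its small gradient is computed cheaply and scattered into the full gradient. The function names and the example problem here are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def structured_gradient(element_grads, index_sets, x):
    """Assemble the gradient of a partially separable
    f(x) = sum_i f_i(x[J_i]) by scattering each small
    element gradient into the full-length vector."""
    g = np.zeros_like(x)
    for grad_i, J in zip(element_grads, index_sets):
        g[J] += grad_i(x[J])  # cheap, low-dimensional work per element
    return g

# Hypothetical example: f(x) = (x0 - x1)^2 + (x1 - x2)^2.
# Each element depends on only two variables.
elems = [lambda z: np.array([2 * (z[0] - z[1]), -2 * (z[0] - z[1])]),
         lambda z: np.array([2 * (z[0] - z[1]), -2 * (z[0] - z[1])])]
idx = [np.array([0, 1]), np.array([1, 2])]

x = np.array([1.0, 2.0, 4.0])
g = structured_gradient(elems, idx, x)  # g == [-2., -2., 4.]
```

With AD supplying each element gradient automatically, the user only describes the index sets; the assembly above is what a structure-exploiting framework automates.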

Date Issued

1998-05

Publisher

Cornell University

Keywords

computer science; technical report

Previously Published As

http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR98-1681

Types

technical report
