Center for Advance Computing
http://hdl.handle.net/1813/569
2016-09-26T12:20:29ZAccurate Solution of Weighted Least Squares by Iterative Methods
http://hdl.handle.net/1813/5598
Accurate Solution of Weighted Least Squares by Iterative Methods
Bobrovnikova, Elena Y.; Vavasis, Stephen A.
We consider the weighted least-squares (WLS) problem with a very ill-conditioned weight matrix. Weighted least-squares problems arise in many applications including linear programming, electrical networks, boundary value problems, and structures. Because of roundoff errors, standard iterative methods for solving a WLS problem with ill-conditioned weights may not give the correct answer. Indeed, the difference between the true and computed solution (forward error) may be large. We propose an iterative algorithm, called MINRES-L, for solving WLS problems. The MINRES-L method is the application of MINRES, a Krylov-space method due to Paige and Saunders, to a certain layered linear system. Using a simplified model of the effects of round off error, we prove that MINRES-L gives answers with small forward error. We present computational experiments for some applications.
1997-02-06T00:00:00ZLocal correlation energies of two-electron atoms and model systems
http://hdl.handle.net/1813/5597
Local correlation energies of two-electron atoms and model systems
Huang, Chien-Jung; Umrigar, C.J.
We present nearly-local definitions of the correlation energy
density, and its potential and kinetic components, and evaluate them for several two-electron systems. This information should provide valuable guidance in constructing better correlation functionals than those in common use. In addition, we demonstrate that the quantum chemistry and the density functional definitions of the correlation energy rapidly approach one another with increasing atomic number.
1997-01-01T00:00:00ZQuality Mesh Generation in Higher Dimensions
http://hdl.handle.net/1813/5596
Quality Mesh Generation in Higher Dimensions
Mitchell, Scott A.; Vavasis, Stephen A.
We consider the problem of triangulating a d-dimensional region. Our mesh generation algorithm, called QMG, is a qradtree-based algorithm that can triangulate any polyhedral region including nonconvexregions with holes. Furthermore, our algorithm guarantees a bounded aspect ratio triangulation provided that the input domain itself has no sharp angles. Finally, our algorithm is guaranteed never to over refine the domain in the sense that the number of simplices produced by QMG is bounded above by a factor times the number produced by any competing algorithm, where the factor depends on the aspect ratio bound satisfied by the competing algorithm. The QMG algorithm has been implemented in C++ and is used as a mesh generator for the finite element method.
1996-12-01T00:00:00ZModel fermion Monte Carlo with correlated pairs II
http://hdl.handle.net/1813/5595
Model fermion Monte Carlo with correlated pairs II
Kalos, M.H.; Schmidt, K.E.
Correlated dynamics can produce stable algorithms for excited states of quantum many-body problems. We study a variety of harmonic oscillator problems to demonstrate the kinds of correlations needed. We show that marginally correct dynamics that produce a stable overlap with an antisymmetrictrial function give the correct fermion ground state.
1996-12-01T00:00:00ZSymmetry, Nonlinear Bifurcation Analysis, and Parallel Computation
http://hdl.handle.net/1813/5594
Symmetry, Nonlinear Bifurcation Analysis, and Parallel Computation
Wohlever, J.C.
In the natural and engineering sciences the equations which model physical systems with symmetry often exhibit an invariance with respect to a particular group "G" of linear transformations. "G" is typically a linear representation of a symmetry group "g" which characterizes the symmetry of the physical system. In this work, we will discuss the natural parallelism which arises while seeking families of solutions to a specific class of nonlinear vector equations which display a special type of group invariance, referred to as equivariance. The inherent parallelism stems for a global de-coupling, due to symmetry, of the full nonlinear equations which effectively splits the original problem into a set of smaller problems. Numerical results from asymmetry-adapted numerical procedure, (MMcontcm.m), written in MultiMATLAB are discussed.
1996-10-01T00:00:00ZMonte Carlo Optimization of Trial Wave Functions in Quantum Mechanics and Statistical Mechanics
http://hdl.handle.net/1813/5593
Monte Carlo Optimization of Trial Wave Functions in Quantum Mechanics and Statistical Mechanics
Nightingale, M.P.; Umrigar, C.J.
This review covers applications of quantum Monte Carlo methods to quantum mechanical problems in the study of electronic and atomic structure, as well as applications to statistical mechanical problems both of static and dynamic nature. The common thread in all these applications is optimization of many-parameter trial states, which is done by minimization of the variance of the local energy or, more generally for arbitrary eigenvalue problems, minimization of the variance of the configurational eigenvalue.
1996-10-01T00:00:00ZA critical assessment of the Self-Interaction Corrected Local Density Functional method and its algorithmic implementation
http://hdl.handle.net/1813/5592
A critical assessment of the Self-Interaction Corrected Local Density Functional method and its algorithmic implementation
Goedecker, S.; Umrigar, C.J.
We calculate the electronic structure of several atoms and small molecules by direct minimization of the Self-Interaction Corrected Local Density Approximation (SIC-LDA) functional. To do this we first derive an expression for the gradient of this functional under the constraint that the orbitals be orthogonal and show that previously given expressions do not correctly incorporate this constraint. In our atomic calculations the SIC-LDA yields total energies, ionization energies and charge densities that are superior to results obtained with the Local Density Approximation (LDA). However, for molecules SIC-LDA gives bond lengths and reaction energies that are inferior to those obtained from LDA. The nonlocal BLYP functional, which we include as a representative GGA functional, out performs both LDA and SIC-LDA forall ground state properties we considered.
1996-10-01T00:00:00ZEuropean Option Pricing with Fixed Transaction Costs
http://hdl.handle.net/1813/5591
European Option Pricing with Fixed Transaction Costs
Aiyer, Ajay Subramanian
In this paper, we study the problem of European option pricing in the presence of fixed transaction costs. The problems of optimal portfolio selection and option pricing in the presence of proportional transaction costs has been extensively studied in the mathematical finance literature. However, much less is known when we have fixed transaction costs. In this paper, we show that calculating the price of an European optioninvolves calculating the value functions of two stochastic impulse control problems and we obtain the explicit expressions for the resultant quasi-variational ine qualities satisfied by the value functions and then carry out a numerical calculation of the option price.
1996-09-01T00:00:00ZOptimal Portfolio Selection with Fixed Transactions Costs in the presence of Jumps and Random Drift
http://hdl.handle.net/1813/5590
Optimal Portfolio Selection with Fixed Transactions Costs in the presence of Jumps and Random Drift
Aiyer, Ajay Subramanian
In this paper, we study the general problem of optimal portfolio selection with fixed transactions costs in the presence of jumps. We extend the analysis of Morton and Pliska to this setting by modeling the return processes of the risky assets in the investor's portfolio as jump-diffusion processes and derive the expression for the related optimal stopping time problem of a Markov process with jumps and explicitly solve it in the situation when the portfolio consists only of one risky asset. We also provide an asymptotic analysis of our model with one risky asset following the ideas of Wilmott and Atkinson. In the process, we also obtain a solution for the "Merton problem" generalized to the situation when there is credit risk. Finally, we consider the case where the drift of the stockprice process is random and unobservable and obtain expressions for the optimal trading policies.
1996-09-01T00:00:00ZNon-normal Dynamics and Hydrodynamic Stability
http://hdl.handle.net/1813/5589
Non-normal Dynamics and Hydrodynamic Stability
Baggett, Jeffrey Scott
This thesis explores the interaction of non-normality and nonlinearity incontinuous dynamical systems. A solution beginning near a linearly stable fixed point may grow large by a linear mechanism, if the linearization is non-normal, until it is swept away by nonlinearities resulting in a much smaller basin of attraction than could possibly be predicted by the spectrum of the linearization. Exactly this situation occurs in certain linearly stable shear flows, where the linearization about the laminar flow may be highly non-normal leading to the transient growth of certain small disturbances by factors which scale with the Reynolds number. These issues are brought into focus in Chapter 1 through the study of atwo-dimensional model system of ordinary differential equations proposed by Trefethen, et al. [Science, 261, 1993].
In Chapter 2, two theorems are proved which show that the basin of attraction of a stable fixed point, in systems of differential equations combining a non-normal linear term with quadratic nonlinearities, can decrease rapidly as the degree of non-normality is increased, often faster than inverse linearly.
Several different low-dimensional models of transition to turbulence are examined in Chapter 3. These models were proposed by more than a dozen authors for a wide variety of reasons, but they all incorporate non-normal linear terms and quadratic nonlinearities. Surprisingly, in most cases, the basin of attraction of the "laminar flow" shrinks much faster than the inverse Reynolds number.
Transition to turbulence from optimally growing linear disturbances, streamwise vortices, is investigated in plane Poiseuille and plane Couette flows in Chapter4. An explanation is given for why smaller streamwise vortices can lead to turbulence in plane Poiseuille flow. In plane Poiseuille flow, the transient linear growth of streamwise streaks caused by non-normality leads directly to a secondary instability.
Certain unbounded operators are so non-normal that the evolution of infinitesimal perturbations to the fixed point is entirely unrelated to the spectrum, even as i to infinity. Two examples of this phenomenonare presented in Chapter 5.
1996-08-01T00:00:00ZStructure and Efficient Hessian Calculation
http://hdl.handle.net/1813/5588
Structure and Efficient Hessian Calculation
Coleman, Thomas F.; Verma, Arun
Modern methods for numerical optimization calculate (or approximate) the matrix of second derivatives, the Hessian matrix, at each iteration. The recent arrival of robust software for automatic differentiation allows for the possibility of automatically computing the Hessian matrix, and the gradient, given a code to evaluate the objective function itself. However, for large-scale problems direct application of automatic differentiation may be unacceptably expensive. Recent work has shown that this cost can be dramatically reduced in the presence of sparsity. In this paper we show that for structured problems it is possible to apply automatic differentiation tools in an economical way - even in the absence of sparsity in the Hessian.
1996-08-01T00:00:00ZStable and Efficient Solution of Weighted Least-Squares Problems with Applications in Interior Point Methods
http://hdl.handle.net/1813/5587
Stable and Efficient Solution of Weighted Least-Squares Problems with Applications in Interior Point Methods
Hough, Patricia D.
In this thesis, we consider two closely related problems. The first is a full-rank weighted least-squares problem with a weight matrix that is positive definite, diagonal, and extremely ill conditioned. The ill-conditioning can cause standard algorithms to compute solutions with, in some cases, no digits of accuracy. Theory suggests the existence of an algorithm that will compute an accurate solution despite the ill-conditioning in the weight matrix. We describe a new algorithm, the Complete Orthogonal Decomposition (COD) Algorithm, for solving the weighted least-squares problem and show that it has this desirable property. In addition, the COD Algorithm is based on standard, well-understood techniques and is straightforward to implement. A natural application for the weighted least-squares problem described in the previous paragraph is interior point methods for linear programming. We discuss the problem in this context, and describe how the COD algorithm can be extended and used in this setting. Unlike other algorithms, this one is stable for interior point methods without assuming nondegeneracy in the linear programming instance. Computational experiments indicate that it is more reliable than other algorithms when the problem is near degenerate. The second problem involves a particular interior point algorithm. In 1994, Vavasis and Ye proposed a new primal-dual path-following interior point method, the layered-step interior point (LIP) method. This algorithm interleaves traditional steps with longer, layered least-squares (LLS) steps. Computation of a LLS step requires solving a weighted least-squares problem similar to the one described above, but the weight matrix also has the property that the weights fall into well-separated groups. This additional structure allows the problem to be broken down into smaller, constrained problems with well-conditioned weight matrices. The smaller problems can then be solved stably with standard algorithms, and the LLS step can be computed. Vavasis and Ye did not propose a particular algorithm for solving the LLS problem. In this thesis, we present an algorithm based on Cholesky factorization. The algorithm is such that a modified version of the sparse Cholesky code of Ng and Peyton of Oak Ridge National Laboratories can be used. Thus, the theoretical results are straight forward, and this algorithm proves to be accurate and efficient in practice.
1996-08-01T00:00:00ZThe Procrustes Problem for Orthogonal Stiefel Matrices
http://hdl.handle.net/1813/5586
The Procrustes Problem for Orthogonal Stiefel Matrices
Bojanczyk, A.W.; Lutoborski, A.
(This abstract contains mathematical symbols that may not reproduce wellin ASCII text.) In this paper we consider the Procrustes problem on the manifold of orthogonal Stiefel matrices. That is, given matrices A epsilon R(m*k), B epsilon R(m*p), m greater tahn or equal to p greater than or equal to k, we seek the minimum of ||A - BQ||2 for all matrices Q epsilon R(p*k), QTQ = I(k*k). We introduce a class of relaxation methods for generating minimizing sequences and offer a geometric interpretation of these methods. Results of numerical experiments illustrating the convergence of the methods are given.
1996-08-01T00:00:00ZScalable Parallel Electronic Structure Calculations on the IBM SP2
http://hdl.handle.net/1813/5585
Scalable Parallel Electronic Structure Calculations on the IBM SP2
Goedecker, Stefan; Hoisie, Adolfy
We have developed a highly efficient and scalable electronic
structure code for parallel computers using message passing. The algorithm takes advantages of the natural parallelism in quantum chemistry problems to obtain very high performance even on a large number of processors. Most of the terms which scale cubically with respect to the number of atoms have been eliminated allowing the treatment of very large systems. It uses one of the most precise versions of Density Functional Theory, namely Self-Interaction Corrected Density Functional Theory.
1996-08-01T00:00:00ZOn the Efficient Methods to Solve ODEs and BVPs Using Automatic Differentiation
http://hdl.handle.net/1813/5584
On the Efficient Methods to Solve ODEs and BVPs Using Automatic Differentiation
Verma, Arun
A large number of physical phenomena are modeled by a system of ODEs
or a system of implicit ODEs. We demonstrate applicability of automatic differentiation (AD) for solving: (1) Boundary value problems in ODEs and implicit ODEs. (2) Initial state and parameter estimation problems. The impact of using AD is two fold. Firstly, efficient methods for computing the gradient vectors and Jacobian matrices have been developed using AD. Secondly the process of getting derivatives via AD is robust, more user friendly, and provides error free derivatives. Furthermore, techniques using AD have been developed which exploit structure in the user's computation, and particularly the structure we observe in boundary value problems or state/parameter estimation problems. We demonstrate by a few experiments the efficiency gained by the usage of AD in solving these problems.
1996-08-01T00:00:00ZPiecewise Differentiable Minimization for Ill-posed Inverse Problems
http://hdl.handle.net/1813/5583
Piecewise Differentiable Minimization for Ill-posed Inverse Problems
Li, Yuying
Based on minimizing a piece wise differentiable lp function subject
to a single inequality constraint, this paper discusses algorithms for a discretized regularization problem for ill-posed inverse problems. We examine computational challenges of solving this regularization problem. Possible minimization algorithms such as the steepest descent method, iteratively weighted least squares (IRLS) method and a recent globally convergent affine scaling Newton approach are considered. Limitations and efficiency of these algorithms are demonstrated using the geophysical travel time tomographic inversion and image restoration applications.
1996-08-01T00:00:00ZTransformation Techniques for Toeplitz and Toeplitz-plus-Hankel Matrices Part II. Algorithms
http://hdl.handle.net/1813/5582
Transformation Techniques for Toeplitz and Toeplitz-plus-Hankel Matrices Part II. Algorithms
Bojanczyk, A.W.; Heinig, George
In the first part of the paper transformations mapping Toeplitz
and Toeplitz-plus-Hankel matrices into generalized Cauchy matrices were studied. In this second part fast algorithms for LU-factorization and inversion of generalized Cauchy matrices are discussed. It is shown that the combination of transformation pivoting techniques leads to algorithms for indefinite Toeplitz and Toeplitz-plus-Hankel matrices that are more stable than the classical ones Special attention is paid to the symmetric and hermitian cases.
1996-08-01T00:00:00ZTransformation Techniques for Toeplitz and Toeplitz-plus-Hankel Matrices Part I. Transformations
http://hdl.handle.net/1813/5581
Transformation Techniques for Toeplitz and Toeplitz-plus-Hankel Matrices Part I. Transformations
Bojanczyk, A.W.; Heinig, George
Transformations of the form A to C1AC2 are investigated that
transform Toeplitz and Toeplitz-plus-Hankel matrices into generalized Cauchy matrices. C1 and C2 are matrices related to the discrete Fourier transformation or to various real trigonometric transformations. Combining these results with pivoting techniques, in part II algorithms for Toeplitz and Toeplitz-plus-Hankel systems will be presented that are more stable than classical algorithms.
1996-08-01T00:00:00ZGeneralized gradient approximations to density functional theory: comparison with exact results
http://hdl.handle.net/1813/5580
Generalized gradient approximations to density functional theory: comparison with exact results
Filippi, Claudia; Gonze, Xavier; Umrigar, C.J.
In order to assess the accuracy of commonly used approximate exchange-correlation density functionals, we present a comparison of accurate exchange and correlation potentials, exchange energy densities and energy components with the corresponding approximate quantities. Four systems are used as illustrative examples: the model system of two electrons in a harmonic potential and the De, Be and Ne atoms. A new ingredient in the paper is the separation of the exchange-correlation potential into exchange and correlation according to the density functional theory definition.
1996-06-01T00:00:00ZSeparation of the Exchange-Correlation Potential into Exchange plus Correlation: an Optimized Effective Potential Approach
http://hdl.handle.net/1813/5579
Separation of the Exchange-Correlation Potential into Exchange plus Correlation: an Optimized Effective Potential Approach
Filippi, Claudia; Umrigar, C.J.; Gonze, Xavier
Most approximate exchange-correlation functionals used within density functional theory are constructed as the sum of two distinct contributions for exchange and correlation. Separating the exchange component from the entire functional is useful since, for exchange, exact relations exist under uniform density scaling and spin scaling. In the past, accurate exchange-correlation potentials have been generated from essentially exact densities but they have not been correctly decomposed into their separate exchange and correlation components (except for two-electron systems). Using a recently proposed method, equivalent to the solution of an optimized effective potential problem with the corresponding orbitals replaced by the exact Kohn-Sham orbitals, we obtain the separation according to the density functional theory definition. We compare the results for the Ne and Be atoms with those obtained by the previously used appromixate separation scheme.
1996-06-01T00:00:00ZSelf-organized criticality, evolution, and extinction
http://hdl.handle.net/1813/5578
Self-organized criticality, evolution, and extinction
Newman, M.E.J.
Statistical analysis indicates that the fossil exctinction record is compatible with a distribution of extinction events whose frequency is related to their size by a power law with exponent tau approx. = 2. This result is in agreement with preductions based on self-organized critical models of extinction, and might well be taken as evidence for self-organizing behavior in terrestrial evolution. We argue however that there is a much simpler explanation for the appearance of a power law in terms of extinctions caused by stresses (either biotic or abiotic) to which species are subjected by their environment. We give an explicit model of this process and discussits properties and implications for the interpretation of the fossil record.
1996-06-01T00:00:00ZMatrix Iterations: The Six Gaps Between Potential Theory and Convergence
http://hdl.handle.net/1813/5577
Matrix Iterations: The Six Gaps Between Potential Theory and Convergence
Driscoll, Tobin A.; Toh, Kim-Chuan; Trefethen, Lloyd N.
The theory of the convergence of Krylov subspace iterations for linear systems of equations (conjugate gradients, biconjugate gradients, GMRES, QMR, Bi-CGSTAB, ...) is reviewed. For a computation of this kind, an estimated asymptotic convergence factor rho less than 1 can be derived by solving a problem of potential theory or conformal mapping. Six approximations are involved in reducing the actual computation to this scalar estimate. These six approximations are discussed in a systematic way and illustrated by a sequence of examples computed with tools of numerical conformal mapping and semidefinite programming.
1996-06-01T00:00:00ZWorkload Evolution on the Cornell Theory Center IBM SP2
http://hdl.handle.net/1813/5576
Workload Evolution on the Cornell Theory Center IBM SP2
Hotovy, Steve
The Cornell Theory Center (CTC) put a 512-node IBM SP2 system into production in early 1995, and extended traces of batch jobs began to be collected in June of that year. An analysis of the workload shows that it has not only grown, but that its characteristics have changed over time. In particular, job duration increased with time, indicative of an expanding production workload. In addition, there was increasing use of parallelism. As the load has increased and larger jobs have become more frequent, the batch management software (IBM's LoadLeveler) has had difficulty in scheduling the requested resources. New policies were established to improve the situation. This paper will profile how the workload has changed over time and give anin-depth look at the maturing workload. It will examine how frequently certain resources are requested and analyze user submittal patterns. It will also describe the policies that were implemented to improve the scheduling situation and their effect on the workload.
1996-06-01T00:00:00ZToward a Portable Parallel Library for Space-Time Adaptive Methods
http://hdl.handle.net/1813/5575
Toward a Portable Parallel Library for Space-Time Adaptive Methods
Lebak, James M.; Durie, Robert C.; Bojancyk, Adam W.
Space-time adaptive processing (STAP) refers to a class of methods for detecting targets using an array of sensors. The output of the array is weighted using data collected from the sensors over a given period of time. An optimal method of calculating weights exists; however, this method is usually computationally impractical. Therefore, various heuristic methods are used that approximate the optimal method. These heuristics use many of the same operations and are computationally demanding. We are in the process of constructing a portable, parallel library of subroutines useful for constructing STAP heuristics. As a first step in this process, we implemented one STAP heuristic, higher-order post-Doppler processing, using three different parallel methods on the IBM SP2 and the Intel Paragon: these methods characterize different parallel approaches to the STAP problem. From implementing these algorithms, we have been able to identify components for our parallel library. We propose models for some of the components and give preliminary timing results for the parallel methods.
1996-06-01T00:00:00ZDomain Decomposition Methods for Conformal Mapping and Eigenvalue Problems
http://hdl.handle.net/1813/5574
Domain Decomposition Methods for Conformal Mapping and Eigenvalue Problems
Driscoll, Tobin Allen
Domain decomposition is widely used in the numerical solution of elliptic boundary value problems. It is appealing in part because of improved efficiency and straightforward parallelization. Conceptually, domain decomposition often exploits a natural feature of elliptic problems: data at one point may have an exceedingly weak influence on the solution at a far-removed point. The application of domain decomposition to other elliptic problems is less fully developed. One such area is numerical conformal mapping. We introduce the SC Toolbox for Matlab, an interactive graphical software package for the Schwarz-Christoffel mapping of polygons. The Toolbox can be used for interior and exterior mapping from several fundamental domains. Elongations in the polygon lead to crowding, in which the preimages of affected vertices are exponentially close together. Such regions are candidates for decomposition. We describe CRDT, an overlapping subdomain method developed with Vavasis for numerical Schwarz-Christoffel mapping. The method uses Delaunay triangulation to decompose the polygon into overlapping quadrilaterals, which in turn define cross-ratios that form the basis of a nonlinear system. Each quadrilateral induces an embedding of the prevertices so that locally, the map can be computed accurately. Apparently CRDT can deal with any degree of crowding, as is demonstrated by examples. Another application in conformal mapping is in Symm's integral equation. An important feature of existing software for Symm's equation is the efficient treatment of corner singularities. Careful generalization to multiple domains allows this treatment to be preserved and extended. An onoverlapping formulation leads to a linear system that is ideal for Schur complementation. The resulting method asymptotically requires a fraction of the single-domain work and is easily parallelized. We also consider a domain decomposition algorithm for the Laplace eigenvalue problem on polygons. This method, an improvement on one described by Descloux and Tolley, searches for the matching of Fourier-Bessel expansions at each corner to locate eigenvalues. We apply the algorithm to the "isospectral drums" discovered by Gordon, Webb, and Wolpert to find 25 eigenvalues to 12 digits. The method is far more accurate and efficient than standard methods for this problem.
1996-05-01T00:00:00ZThe Chebyshev Polynomials of a Matrix
http://hdl.handle.net/1813/5573
The Chebyshev Polynomials of a Matrix
Toh, Kim-Chuan; Trefethen, Lloyd N.
A Chebyshev polynomial of a square matrix A is a monic polynomial of specified degree that minimizes ||p(A)||(sub2). The study of such polynomials is motivated by the analysis of Krylov subspace iterations in numerical linear algebra. An algorithm is presented for computing these polynomials based on reduction to a semidefinite program which is then solved by a primal-dual interior point method. Examples of Chebyshev polynomials of matrices are presented, and it is noted that if A is far from normal, the lemniscates of these polynomials tend to approximate pseudospectra of A.
1996-05-01T00:00:00ZMultiMATLAB: MATLAB on Multiple Processors
http://hdl.handle.net/1813/5572
MultiMATLAB: MATLAB on Multiple Processors
Trefethen, Anne E.; Menon, Vijay S.; Chang, Chi-Chao; Czajkowski, Grezgorz J.; Myers, Chris; Trefethen, Lloyd N.
MATLAB(R), a commercial product of The MathWorks, Inc., has become one of the principal languages of desktop scientific computing. A system is described that enables one to run MATLAB conveniently on multiple processors. Using short, MATLAB-style commands like Eval, Send, Recv, Bcast, Min, and Sum, the user operating within one MATLAB session can start various processes in a fashion that maintains MATLAB's traditional user-friendliness. Multi-processor graphics is also supported. The system currently runs under MPICH on an IBM SP2 or a network of Unix workstations, and extensions are planned to networks of PCs. MultiMATLAB is potentially useful for education in parallel programming, for prototyping parallel algorithms, and for fast and convenient execution of easily parallelizable numerical computations on multiple processors.
1996-05-01T00:00:00ZStructure and Efficient Jacobian Calculation
http://hdl.handle.net/1813/5571
Structure and Efficient Jacobian Calculation
Coleman, Thomas F.; Verma, Arun
Many computational tasks require the determination of the Jacobian matrix, at a given argument, for a large nonlinear system of equations. Calculation or approximation of a Newton step is a related task. The development of robust automatic differentiation (AD) software allows for "painless" and accurate calculation of these quantities; however, straight forward application of AD software on large-scale problems can require an inordinate amount of computation. Fortunately, large-scale systems of nonlinear equations typically exhibit either sparsity or structure in their Jacobian matrices. In this paper we proffer general approaches for exploiting sparsity and structure to yield efficient ways to determine Jacobian matrices (and Newton steps) via automatic differentiation.
1996-03-01T00:00:00ZAvalanches, Scaling, and Coherent Noise
http://hdl.handle.net/1813/5570
Avalanches, Scaling, and Coherent Noise
Newman, M.E.J.; Sneppen, Kim
We present a simple model of a dynamical system driven slowly by externally-imposed coherent noise. Although the system never becomes critical in the sense of possessing spatial correlations of arbitrarily long range, it does organize into a stationary state characterized by avalanches with a universal power-law size distribution. We explain the behavior of the model within a time-averaged approximation, and discuss its connection to the dynamics of earthquakes, the Gutenberg-Richter law, and to recent experiments on avalanches in rice piles.
1996-03-01T00:00:00ZLow-dimensional models of subcritical transition to turbulence
http://hdl.handle.net/1813/5569
Low-dimensional models of subcritical transition to turbulence
Baggett, Jeffrey S.; Trefethen, Lloyd N.
In the past five years, working largely independently, five groups of researchers have proposed low-dimensional models of the behavior of parallel shear flows at high Reynolds numbers. These models are compared, and it is found that they are more similar than their authors have recognized. Among other similarities, most of them exhibit a threshold amplitude c=O(R**alpha) as R to infinity for some alpha less than -1, where R is the Reynolds number, for perturbations of the laminar state that may excite transition to turbulence.
1996-02-01T00:00:00ZAnalysis of the Early Workload on the Cornell Theory Center IBM SP2
http://hdl.handle.net/1813/5568
Analysis of the Early Workload on the Cornell Theory Center IBM SP2
Hotovy, Steven; Schneider, David; O'Donnell, Timothy
Parallel computers have matured to the point where they are capable of running a significant production workload. Characterizing this workload, however, is far more complicated than for the single-processor case. Besides the varying number of processors that may be invoked, the nodes themselves may provide differing computational resources (memory size, for example). In addition, the batch schedulers may introduce further categories of service which must be considered in the analysis. The Cornell Theory Center (CTC) put a 512-node IBM SP2 system into production in early 1995. Extended traces of batch jobs began to be collected in mid-1995 when the usage base became sufficiently large. This paper offers an analysis of this early batch workload.
1996-01-01T00:00:00ZNumerical Conformal Mapping Using Cross-ratios and Delaunay Triangulation
http://hdl.handle.net/1813/5567
Numerical Conformal Mapping Using Cross-ratios and Delaunay Triangulation
Driscoll, Tobin; Vavasis, Stephen A.
We propose a new algorithm for computing the Riemann mapping of the unit disk to a polygon, also known as the Schwarz-Christoffel transformation. The new algorithm, CRDT, is based on cross-ratios of the prevertices, and also on cross-ratios of quadrilaterals in a Delaunay triangulation of the polygon. The CRDT algorithm produces an accurate representation of the Riemann mapping even in the presence of arbitrary long, thin regions in the polygon, unlike any previous conformal mapping algorithm. We believe that CRDT can never fail to converge to the correct Riemann mapping, but the correctness and convergence proof depend on conjectures that we have so far not been able to prove. We demonstrate convergence with computational experiments. The Riemann mapping has applications to problems in two-dimensional potential theory and to finite-difference mesh generation. We use CRDT to produce a mapping and solve a boundary value problem on long, thin regions for which no other algorithm can solve these problems.
1996-02-01T00:00:00ZMulticonfiguration Wavefunctions for Quantum Monte Carlo Calculations of First-row Diatomic Molecules
http://hdl.handle.net/1813/5566
Multiconfiguration Wavefunctions for Quantum Monte Carlo Calculations of First-row Diatomic Molecules
Filippi, Claudia; Umrigar, C.J.
We use the variance minimization method to determine accurate wavefunctions for first-row homonuclear diatomic molecules. The form of the wave function is a product of a sum of determinants and a generalized Jastrow factor. One of the important features of the calculation is that we are including low-lying determinants corresponding to single and double excitations from the Hartree-Fock configuration within the space of orbitals whose atomic principal quantum numbers do not exceed those occurring in the Hartree-Fock configuration. The idea is that near-degeneracy correlation is most effectively described by a linear combination of low-lying determinants whereas dynamic correlation is well described by the generalized Jastrow factor. All the parameters occurring in both the determinantal and the Jastrow parts of the wave function are optimized. The optimized wave functions recover 77-94% of the correlation energy in variational Monte Carlo and 91-99% of the correlation energy in diffusion Monte Carlo.
1996-01-01T00:00:00ZLocality-Conscious Load Balancing: Connectionist Architectural Support
http://hdl.handle.net/1813/5565
Locality-Conscious Load Balancing: Connectionist Architectural Support
Tumuluri, Chaitanya; Choudhary, Alok N.; Mohan, Chilukuri K.
Traditionally, in distributed memory architectures, locality maintenance and load balancing are seen as user level activities involving compiler and runtime system support in software. Such software solutions require an explicit phase of execution, requiring the application to suspend its activities. This paper presents the first (to our knowledge) architecture-level scheme for extracting locality concurrent with the application execution. An artificial neural network coprocessor is used for dynamically monitoring processor reference streams to learn temporally emergent utilities of data elements in ongoing local computations. This facilitates use of kernel-level load balancing schemes thus, easing the user programming burden. The kernel-level scheme migrates data to processor memories evincing higher utilities during load-balancing. The performance of an execution-driven simulation evaluating the proposed coprocessor is presented for three applications. The applications chosen represent the range of load and locality fluxes encounted in parallel programs, with (a) static locality and load characteristics, (b) slowly varying localities for fixed datasetsizes and (c) rapidly fluctuating localities among slowly varying datasetsizes. The performance results indicate the viability and success of the coprocessor in concurrently extracting locality for use in load balancing activities.
1996-01-01T00:00:00ZSolving Unconstrained Discrete-time Optimal Control Problems Using Trust Region Method
http://hdl.handle.net/1813/5564
Solving Unconstrained Discrete-time Optimal Control Problems Using Trust Region Method
Liao, Aiping
Trust region method for a class of large-scale minimization problems,
the unconstrained discrete-time optimal control (DTOC) problems, is considered. Although the trust region algorithms developed in [4] and [13] are very economical they lack the ability to handle the so-called hard case. In this paper, we show that the trust region subproblem can be solved within an acceptable accuracy without forming the Hessian explicitly. The new approach is based on the inverse power method for eigenvalue problem and possesses the ability to handle the hard case. Our proposed approach leads to more efficient algorithms for DTOC problems.
1995-12-01T00:00:00ZA Comparison of Optimization Heuristics for the Data Mapping Problem
http://hdl.handle.net/1813/5563
A Comparison of Optimization Heuristics for the Data Mapping Problem
Chrisochoides, Nikos; Mansour, Nashat; Fox, Geoffrey
In this paper we compare the performance of six heuristics with
suboptimal solutions for the data mapping problem of two dimensional meshes that are used for the numerical solution of Partial Differential Equations(PDEs) on multicomputers. The data mapping heuristics are evaluated with respect to seven criteria covering load balancing, interprocessor communication,flexibility and ease of use for a class of single-phase iterative PDE solvers. Our evaluation suggests that the simple and fast block distribution heuristic can be as effective as the other five complex and computationally expensive algorithms.
1995-12-01T00:00:00ZDealing with Dense Rows in the Solution of Sparse Linear Least Squares Problems
http://hdl.handle.net/1813/5562
Dealing with Dense Rows in the Solution of Sparse Linear Least Squares Problems
Sun, Chunguang
Sparse linear least squares problems containing a few relatively dense rows occur frequently in practice. Straightforward solution of these problems could cause catastrophic fill and delivers extremely poor performance. This paper studies a scheme for solving such problems efficiently by handling dense rows and sparse rows separately. How a sparse matrix is partitioned into dense rows and sparse rows determines the efficiency of the overall solution process. A new algorithm is proposed to find a partition of a sparse matrix which leads to satisfactory or even optimal performance. Extensive numerical experiments are performed to demonstrate the effectiveness of the proposed scheme. A MATLAB implementation is included.
1995-12-01T00:00:00ZPseudospectra of Linear Operators
http://hdl.handle.net/1813/5561
Pseudospectra of Linear Operators
Trefethen, Lloyd N.
The following contains mathematical formulae and symbols that may become distorted in ASCII text format. The advent of ever more powerful computers has brought with it a new way of conceiving some of the fundamental eigenvalue problems of applied mathematics. If a matrix or linear operator "A" is far from normal, its eigenvalues or more generally its spectrum may have little to do with its behavor as measured by quantities such as ||A**N|| or ||exp(tA)||. More may be learned by examining the sets in the complex plane known as the "pseudospectra" of A, defined by level curves of the norm of the resolvent, ||(zI - A)**-1||. Five years ago, the author published a paper that presented computed pseudospectra of thirteen highly non-normal matrices arising in various applications. Since that time, analogous computations have been carried out for differential and integral operators. This paper, a companion to the earlier one, presents ten examples, each chosen to illustrate one or more mathematical or physical principles.
1995-12-01T00:00:00ZThe Efficient Computation of Sparse Jacobian Matrices Using Automatic Differentiation
http://hdl.handle.net/1813/5560
The Efficient Computation of Sparse Jacobian Matrices Using Automatic Differentiation
Coleman, Thomas F.; Verma, Arun
This paper is concerned with the efficient computation of sparse Jacobian matrices of nonlinear vector maps using automatic differentiation (AD). Specifically, we propose the use of a graph coloring technique, bi-coloring, to exploit the sparsity of the Jacobian matrix J and thereby allow for the efficient determination of J using AD software. We analyze both a direct scheme and a substitution process. We discuss the results of numerical experiments indicating significant practical potential of this approach.
1995-12-01T00:00:00ZA Newton Acceleration of the Weiszfeld Algorithm for Minimizing the Sum of Euclidean Distances
http://hdl.handle.net/1813/5559
A Newton Acceleration of the Weiszfeld Algorithm for Minimizing the Sum of Euclidean Distances
Li, Yuying
The Weiszfeld algorithm for continuous location problems can be considered as an iteratively reweighted least squares method. It exhibits linear convergence. In this paper, a Newton type algorithm with similar simplicity is proposed to solve a continuous multifacility location problem with Euclidean distance measure. Similar to the Weiszfeld algorithm, at each iteration the main computation can be solving a weighted least squares problem. A Cholesky factorization of a symmetric positive definite band matrix, typically with a relatively small band width (e.g., a band width of two for a Euclidean location problem on a plane) is required. This new algorithm can be regarded as a Newton acceleration to the Weiszfeld algorithm with fast global and local convergence. The simplicity and efficiency of the proposed algorithm makes it particularly suitable for large-scale Euclidean location problems and parallel implementation. Computational experience also suggests that the proposed algorithm performs remarkably well in the presence of degeneracy and near degeneracy. In addition, it is proven to be globally convergent. Although the local convergence analysis is still under investigation, computation results suggest that it is typically superlinearly convergent.
1995-11-01T00:00:00ZAn Aspect Ratio Bound for Triangulating a d-grid Cut by a Hyperplane
http://hdl.handle.net/1813/5558
An Aspect Ratio Bound for Triangulating a d-grid Cut by a Hyperplane
Mitchell, Scott A.; Vavasis, Stephen A.
We consider the problem of triangulating a d-dimensional uniform grid of d-cubes that is cut by a k-dimensional affine subspace. The goal is to obtain a triangulation with bounded aspect ratio. To achieve this goal, we allow some of the box faces near the affine subspace to be displaced. This problem has applications to finite element mesh generation. For general d and k, the bound on aspect ratio that we attain is double-exponential in d. For the important special case of d = 3, the aspect ratio bound is small enough that the technique is useful in practice.
1995-11-01T00:00:00ZMultithreaded model for dynamic load balancing parallel adaptive PDE computations
http://hdl.handle.net/1813/5557
Multithreaded model for dynamic load balancing parallel adaptive PDE computations
Chrisochoides, Nikos
We present a multithreaded model for the dynamic load-balancing ofnumerical, adaptive computations required for the solution of Partial Differential Equations (PDEs) on multiprocessors. Multithreading is used as a means of exploring concurrency in the processor level inorder to tolerate synchronization costs inherent to traditional (non-threaded) parallel adaptive PDE solvers. Our preliminary analysis for parallel, adaptive PDE solvers indicates that multithreading can be used as a mechanism to mask overheads required for the dynamic balancing of processor workloads with computations required for the actual numerical solution of the PDEs. Also, multithreading can simplify the implementation of dynamic load-balancing algorithms, a task that is very difficult for traditional data parallel adaptive PDE computations. Unfortunately, multithreading does not always simplify program complexity, often makes code re-usability not an easy task, and increases software complexity.
1995-10-01T00:00:00ZA Model for Evolution and Extinction
http://hdl.handle.net/1813/5556
A Model for Evolution and Extinction
Roberts, Bruce W.; Newman, M.E.J.
We present a model for evolution and extinction in large ecosystems. The model incorporates the effects of interactions between species and the influences of abiotic environmental factors. We study the properties of the model by approximate analytic solution and also by numerical simulation, and use it to make predictions about the distribution of extinctions and species lifetimes that we would expect to see in real ecosystems. It should be possible to test these predictions against the fossil record. The model indicates that a possible mechanism for mass extinction is the coincidence of a large coevolutionary avalanche in the ecosystem with a severe environmental disturbance.
1995-08-01T00:00:00ZA Method for Imaging Corrosion Damage in Thin Plates from Electrostatic Data
http://hdl.handle.net/1813/5555
A Method for Imaging Corrosion Damage in Thin Plates from Electrostatic Data
Kaup, Peter G.; Santosa, Fadil; Vogelius, Michael
The problem of quantitative nondestructive evaluation of corrosion in plates is considered. The inpection method uses boundary measurements of currents and voltages to determine the material loss caused by corrosion. The development of the method is based on linearization and the assumption that the plate is thin. The behavior of the method is examined in numerical situations.
1995-08-01T00:00:00ZMonte Carlo Study of the Random-field Ising Model
http://hdl.handle.net/1813/5554
Monte Carlo Study of the Random-field Ising Model
Newman, M.E.J.; Barkema, G.T.
Using a cluster-flipping Monte Carlo algorithm combined with a generalization of the histogram reweighting scheme of Ferrenberg and Swendsen, we have studied the equilibrium properties of the thermal random-field Ising model on a cubic lattice in three dimensions. We have equilibrated systems of L x L x L spins, with values of L up to 32, and for these systems the cluster-flipping method appears to a large extent to overcome the slow equilibration seen in single-spin-flip methods. From the results of our simulations we have extracted values for the critical exponents and the critical temperature and randomnessof the model by finite size scaling. For the exponents we find v=1.02 +- 0.06, B=0.06 +- 0.07, y=1.9 +- 0.2, and mean(y)=2.9 +- 0.2.
1995-07-01T00:00:00ZA Subspace, Interior, and Conjugate Gradient Method for Large-scale Bound-constrained Minimization Problems
http://hdl.handle.net/1813/5553
A Subspace, Interior, and Conjugate Gradient Method for Large-scale Bound-constrained Minimization Problems
Branch, Mary Ann; Coleman, Thomas F.; Li, Yuying
A subspace adaption of the Coleman-Li trust region and interior method is proposed for solving large-scale bound-constrained minimization problems. This method can be implemented with either sparse Cholesky factorization or conjugate gradient computation. Under reasonable conditions the convergence properties of this subspace trust region method are as strong as those of its full-space version. Computational performance on various large-scale test problems are reported; advantages of our approach are demonstrated. Our experience indicates our proposed method represents an efficient way to solve large-scalebound-constrained minimization problems.
1995-07-01T00:00:00ZNonlinear Wave Equations for Relativity
http://hdl.handle.net/1813/5552
Nonlinear Wave Equations for Relativity
van Putten, Maurice H.P.M.; Eardley, Douglas M.
Gravitational radiation is described by canonical Yang-Mills wave equations on the curved space-time manifold, together with evolution equations for the metric in the tangent bundle. The initial data problem is described in Yang-Mills scalar and vector potentials, resulting in Lie-constraints in addition to the familiar Gauss-Codacci relations
1995-07-01T00:00:00ZAutomatic Optimization
http://hdl.handle.net/1813/5551
Automatic Optimization
Liao, Aiping
We propose some automatic techniques for unconstrained optimization with the objective function given by some computer program. We show that the Newton step can be calculated in O(m2) operations where m is the number of stages in the function evaluation program. We also show that methods developed in Coleman and Liao [1] and Liao [8] for unconstrained discrete-time optimal control problems can be modified to handle general cases.
1995-07-01T00:00:00ZFast Compiled Logic Simulation Using Linear BDDs
http://hdl.handle.net/1813/5550
Fast Compiled Logic Simulation Using Linear BDDs
Gupta, Sudeep; Pingali, Keshav
This paper presents a new technique for compiled zero delay logic simulation, and includes extensive experiments that demonstrate its performance on standard benchmarks. Our compiler partitions the circuit into fanout-freeregions (FFRs), transforms each FFR into a linear sized BDD, and converts each BDD into executable code. In our approach, the computation is sublinear in the number of variables within each partition because only one path, from root to leaf, of the BDD is executed; therefore in many cases, substantial computation is avoided. In this way, our approach gets dome to the advantages of oblivious as well as demand-driven evaluation. We investigated the impact of the various heuristics on performance, and based on this data, we recommend good values for design parameters. A performance improvement of up to 67% over oblivious simulation is observed for our benchmarks.
1995-06-01T00:00:00ZModel Fermion Monte Carlo with Correlated Pairs
http://hdl.handle.net/1813/5549
Model Fermion Monte Carlo with Correlated Pairs
Kalos, Malvin H.
The issues that prevent the development of efficient and stable algorithms for fermion Monte Carlo in continuum systems are reexamined with special reference to the implications of the "plus/minus" symmetry. This is a property of many algorithms that use signed walkers, namely that the dynamics are unchanged when the signs of the walkers are interchanged. Algorithms that obey this symmetry cannot exhibit the necessary stability. Specifically, estimates of the overlap with any antisymmetric test function cannot be bounded away from zero in the limit of many iterations. Within the framework of a diffusion Monte Carlo treatment of the Schroedinger equation, it is shown that this symmetry is easily broken for pairs of walkers while at the same time preserving the correct marginal dynamics for each member of the pair. The key is to create different classes of correlations between members of pairs and to use (at least) two distinct correlations for a pair and for the same pair withthe signs exchanged. The ideas are applied successfully for a class of simplemodel problems in two dimensions.
1995-06-01T00:00:00Z