ORIE Technical Reports
http://hdl.handle.net/1813/3942
2017-09-26T00:05:46ZExtreme value analysis without the largest values: what can be done?
http://hdl.handle.net/1813/47968
Extreme value analysis without the largest values: what can be done?
Zou, Jingjing; Davis, Richard; Samorodnitsky, Gennady
In this paper we are concerned with the analysis of heavy-tailed data when a
portion of the extreme values are unavailable. This research was motivated by an
analysis of the degree distributions in a large social network. The degree distributions
of such networks tend to have power law behavior in the tails. We focus on the Hill
estimator, which plays a starring role in heavy-tailed modeling. The Hill estimator for
this data exhibited a smooth and increasing “sample path” as a function of the number
of upper order statistics used in constructing the estimator. This behavior became
more apparent as we artificially removed more of the upper order statistics. Building
on this observation, we introduce a new parameterization into the Hill estimator
that corresponds to the proportion of
extreme values that are unavailable and the proportion of upper order statistics used
in the estimation. We establish functional convergence of the
normalized Hill estimator to a Gaussian random field. An estimation procedure is
developed based on the limit theory to estimate the number of missing extremes and
extreme value parameters including the tail index and the bias of Hill’s estimate. We
illustrate how this approach works in both simulations and real data examples.
2017-01-01T00:00:00ZExtremal theory for long range dependent infinitely divisible processes
http://hdl.handle.net/1813/47543
Extremal theory for long range dependent infinitely divisible processes
Samorodnitskty, Gennady; Wang, Yizao
We prove limit theorems of an entirely new type for certain long memory
regularly varying stationary \id\ random processes. These theorems
involve multiple
phase transitions governed by how long the memory is. Apart from one
regime, our results exhibit limits that are not among the
classical extreme value distributions. Restricted to the one-dimensional
case, the distributions we obtain interpolate, in the appropriate
parameter range, the $\alpha$-Fr\'echet distribution and the skewed
$\alpha$-stable distribution. In general, the limit is a new family
of stationary and self-similar random sup-measures with parameters
$\alpha\in(0,\infty)$ and $\beta\in(0,1)$, with
representations based on intersections of independent $\beta$-stable regenerative sets. The tail of the limit random sup-measure on each interval with finite positive length is regularly varying with index $-\alpha$.
The intriguing structure of these random sup-measures is due to
intersections of independent $\beta$-stable regenerative sets and the
fact that the number of such sets intersecting simultaneously increases to infinity as $\beta$ increases to one.
The results in this paper extend substantially previous investigations where only $\alpha\in(0,2)$ and $\beta\in(0,1/2)$ have been considered.
2017-03-01T00:00:00ZDistance covariance for stochastic processes
http://hdl.handle.net/1813/45412
Distance covariance for stochastic processes
Matsui, Muneya; Mikosch, Thomas; Samorodnitsky, Gennady
The distance covariance of two random vectors is a measure of their dependence.
The empirical distance covariance and correlation can be used as statistical tools for
testing whether two random vectors are independent.
We propose an analogs
of the distance covariance for two stochastic processes defined on some interval. Their empirical
analogs can be used to test the independence of two processes.
2016-12-13T00:00:00ZClosed-form Ruin Probabilities in Classical Risk Models with Gamma Claims
http://hdl.handle.net/1813/45410
Closed-form Ruin Probabilities in Classical Risk Models with Gamma Claims
Constantinescu, Corina; Samorodnitsky, Gennady; Zhu, Wei
In this paper we provide three equivalent expressions for ruin probabilities in a Cram\'er-Lundberg model with gamma distributed claims. The results are solutions of integro-differential equations, derived by means of (inverse) Laplace transforms. All the three formulas have infinite series forms, two involving Mittag-Leffler functions and the third one involving moments of the claims distribution. This last result applies to any other claim size distributions that exhibits finite moments.
CC and WZ would like to acknowledge the European Union's Seventh Framework Programme for research, technological development and demonstration, Marie Curie grant no 318984, on Risk Analysis, Ruin and Extremes - RARE. GS would like to acknowledge the ARO MURI grant W911NF-12-1-0385
2016-12-12T00:00:00ZExtreme values of the uniform autoregressive processes and missing observations
http://hdl.handle.net/1813/44459
Extreme values of the uniform autoregressive processes and missing observations
Glavas, Lenka; Mladenovic, Pavle; Samorodnitsky, Gennady
We investigate partial maxima of the uniform AR(1) processes
with parameter r geq 2. Positively and negatively correlated processes are considered.
New limit theorems for maxima in complete and incomplete samples are obtained
2016-01-01T00:00:00ZAsymptotic behaviour of Gaussian minima
http://hdl.handle.net/1813/43886
Asymptotic behaviour of Gaussian minima
Chakrabarty, Arijit; Samorodnitsky, Gennady
We investigate what happens when an entire sample path of a smooth
Gaussian process on a compact interval lies above a high
level. Specifically, we determine the precise asymptotic probability
of such an event, the extent to which the high level is exceeded,
the conditional shape of the process above the high level, and the
location of the minimum of the process given that the sample path is
above a high level.
2016-01-01T00:00:00ZContact distribution in a thinned Boolean model with power law radii
http://hdl.handle.net/1813/43140
Contact distribution in a thinned Boolean model with power law radii
Dong, Yinghua; Samorodnitsky, Gennady
We consider a weighted stationary spherical Boolean
model in R^d. Assuming that the radii of the balls in the Boolean
model have regularly varying tails, we establish the asymptotic
behaviour of the tail
of the contact distribution of the thinned germ-grain model under 4
different thinning procedures of the original model.
2016-03-01T00:00:00ZMultivariate Subexponential Distributions and Their Applications
http://hdl.handle.net/1813/40837
Multivariate Subexponential Distributions and Their Applications
Samorodnitsky, Gennady; Sun, Julian
We propose a new definition of a multivariate subexponential
distribution. We compare this definition with the two existing
notions of multivariate subexponentiality, and compute the
asymptotic behaviour of the ruin probability in the context of an
insurance portfolio, when multivariate subexponentiality
holds. Previously such results were available only in the case of
multivariate regularly varying claims.
2015-09-16T00:00:00ZFul lling Orders in a Multi-Echelon Capacitated On-line Retail System: PART TWO, real-time purchasing and ful llment decision making
http://hdl.handle.net/1813/40146
Ful lling Orders in a Multi-Echelon Capacitated On-line Retail System: PART TWO, real-time purchasing and ful llment decision making
Li, Juan; Muckstadt, John
When ful lling customer orders, on-line retailers must operate their multi-warehouse systems with great
care to ensure that these orders are satis ed in a timely and cost e ective manner. We worked closely with
a major on-line retailer to design an e ective and e cient ful llment system. This included establishing
policies and procedures for ordering, receiving, storing and shipping of goods. Internal warehousing and
transportation practices were addressed, and new approaches for managing inventories were established. In
this paper we focus on one type of inventory management problem faced by the company when making
daily purchasing and allocation decisions. These decisions are of two types, the positioning of inventories in
their multi-echelon system and the detailed manner in which they use inventories to ful ll speci c customer
orders. After reviewing some of the key attributes of models that address the two types of decision problems,
we present a computationally tractable approach for solving these problems for a system that must ful ll
many hundreds of thousands of orders daily.
2015-05-13T00:00:00ZAsymptotic Normality of Degree Counts in a Preferential Attachment Model
http://hdl.handle.net/1813/39933
Asymptotic Normality of Degree Counts in a Preferential Attachment Model
Resnick, Sidney; Samorodnitsky, Gennady
Preferential attachment is a widely adopted paradigm for understanding
the dynamics of
social networks. Formal statistical inference,
for instance GLM techniques, and model
verification methods will require knowing test statistics are asymptotically
normal even though node or count based
network data is nothing like classical data from
independently replicated experiments. We therefore study asymptotic
normality of degree counts for a sequence of growing simple undirected
preferential attachment graphs. The methods of proof rely on
identifying martingales and then exploiting the martingale central
limit theorems.
2015-04-28T00:00:00ZFunctional central limit theorem for negatively dependent heavy-tailed stationary infinitely divisible processes generated by conservative flows
http://hdl.handle.net/1813/39288
Functional central limit theorem for negatively dependent heavy-tailed stationary infinitely divisible processes generated by conservative flows
Jung, Paul; Owada, Takashi; Samorodnitsky, Gennady
We prove a functional central limit theorem for
partial sums of symmetric stationary long range dependent heavy tailed
infinitely divisible processes with a certain type of negative
dependence. Previously only positive dependence could be treated. The
negative dependence involves cancellations of the Gaussian second
order. This
leads to new types of {limiting} processes involving stable random
measures, due to heavy tails, Mittag-Leffler processes, due to long
memory, and Brownian motions, due to the Gaussian second order
cancellations.
2015-04-06T00:00:00ZMulti-Period Stock Allocation Via Robust Optimization
http://hdl.handle.net/1813/39275
Multi-Period Stock Allocation Via Robust Optimization
Jackson, Peter; Muckstadt, John
In this paper we re-visit a long-standing multi-echelon inventory al-location problem from a robust optimization perspective. We formulate the problem as a one warehouse, N-retailer, multi-period, stock allocation problem in which holding costs are identical at each location and
no stock is received from outside suppliers for the duration of the planning horizon. Stock may be transferred from the central warehouse to
the retailers instantaneously and without cost at the beginning of each
period for which the central warehouse still has stock on hand. No other
stock transfers are allowed. Under this set-up, the only motive for holding inventory at the central warehouse for allocation in future periods is
the so-called risk-pooling motive. The dynamic programming formulation
of this problem requires a state space too large for practical computation. Various approximation methods have been proposed for variants of
this problem. We apply robust optimization to this problem extending
the typical uncertainty set to capture the risk pooling phenomenon and
extending the inventory policy to allow for an adaptive, non-anticipatory
shipment policy. We show how to represent the uncertainty set compactly
so that it grows by no more than the square of the number of retailers.
The problem can be solved using Benders decomposition in the general
case. In the special case of no initial retailer inventories, two periods, and
identical retailers, a relaxed form of the problem admits a closed form
solution with surprising insights. Summarizing the experimental results
of the paper, we see both confirmation of the value of the robust optimization approach as well as managerial insights into the design and operation
of multi-echelon inventory systems.
2015-03-25T00:00:00ZClimbing down Gaussian peaks
http://hdl.handle.net/1813/39077
Climbing down Gaussian peaks
Adler, Robert; Samorodnitsky, Gennady
How likely is the high level of a continuous Gaussian random field on
an Euclidean space to
have a ``hole'' of a certain dimension and depth? Questions of this
type are difficult, but in this paper we make progress on questions
shedding new light in existence of holes. How likely is the field to
be above a high level on one compact set (e.g. a sphere) and to be
below a fraction of that level on some other compact set, e.g. at the
center of the corresponding ball? How likely is the field to be below that
fraction of the level anywhere nside the ball? We work on the
level of large deviations.
2015-01-28T00:00:00ZNumerical Validation of Fill Rate Estimation Methods for Two- and Three-Demand Class Rationing Policies with One-for-One Replenishment and General Lead Time Distributions
http://hdl.handle.net/1813/39019
Numerical Validation of Fill Rate Estimation Methods for Two- and Three-Demand Class Rationing Policies with One-for-One Replenishment and General Lead Time Distributions
Vicil, Oguzhan; Jackson, Peter
In this report, we conduct numerical simulations of two- and three-demand class inventory threshold rationing systems under one-for-one replenishment policies. The performance metrics of interest are the fill rates of the high priority demand classes (the gold fill rate in the two demand
class system and the platinum and gold fill rates in the three-demand class system).
Our main interest is in the sensitivity of these fill rates to the form of the replenishment lead time probability distribution and the resulting quality of approximation methods used to estimate
these fill rates. We consider three approximation methods: what we call the single cycle approach attributed to Dekker et al and Deshpande et al, the embedded Markov chain approach of Fadigloglu and Bulut, and the continuous time Markov chain approach of Vicil and Jackson.
We confirm the superiority of the embedded Markov chain approach for the case of constant lead times but we find that the fill rates are relatively insensitive to the form of the lead time distribution and both latter approaches, the embedded Markov chain approach and the continuous time Markov chain approach, perform well over wide ranges of lead time variability. For the
three-demand class system, we demonstrate that it is possible to achieve highly differentiated fill rates by demand class and show that these fill rates can be estimated with high accuracy using the continuous time Markov chain approach, provided the fill rate of the lowest priority demand class (the silver fill rate) is not too low.
2015-01-08T00:00:00ZTime-changed extremal process as a random sup measure
http://hdl.handle.net/1813/37941
Time-changed extremal process as a random sup measure
Lacaux, Céline; Samorodnitsky, Gennady
A functional limit theorem for the partial maxima of a long memory
stable sequence produces a limiting process that can be described as a
beta-power time change in the classical Fr\'echet
extremal process, for beta in a subinterval of the unit
interval. Any such power time change in the extremal process
for 0<beta<1 produces a process with stationary
max-increments. This deceptively simple time change hides the much
more delicate structure of the resulting process as a self-affine
random sup measure. We uncover this structure and show that in a
certain range of the parameters this random measure arises as a limit
of the partial maxima of the same long memory stable sequence, but in
a different space. These results open a way to construct a whole new
class of self-similar Fr\'echet processes with stationary
max-increments.
2014-10-09T00:00:00ZTauberian Theory for Multivariate Regularly Varying Distributions with Application to Preferential Attachment Networks
http://hdl.handle.net/1813/36712
Tauberian Theory for Multivariate Regularly Varying Distributions with Application to Preferential Attachment Networks
Resnick, Sidney; Samorodnitsky, Gennady
Abel-Tauberian theorems relate power
law behavior of distributions and their transforms. We formulate and
prove a multivariate version for non-standard regularly varying
measures on R_+^p and then apply it to
prove that the joint distribution of in- and out-degree in a directed edge
preferential attachement model has jointly regularly varying
tails.
2014-06-25T00:00:00ZNonstandard regular variation of the in-degree and the out-degree in the preferential attachement model
http://hdl.handle.net/1813/36711
Nonstandard regular variation of the in-degree and the out-degree in the preferential attachement model
Samorodnitsky, Gennady; Resnick, Sidney; Towsley, Don; Davis, Richard; Willis, Amy; Wan, Phyllis
For the directed edge preferential attachment network growth model
studied by Bollobas et al. (2003) and
Krapivsky and Redner (2001), we prove that the joint distribution of
in-degree and
out-degree
has jointly regularly varying
tails.
Typically the marginal tails of the in-degree distribution and the out-degree
distribution have different regular variation indices and so the joint
regular variation is non-standard.
Only marginal regular variation has been
previously established for this distribution in the cases where the
marginal tail indices are different.
2014-06-25T00:00:00ZGeneral inverse problems for regular variation
http://hdl.handle.net/1813/34411
General inverse problems for regular variation
Damek, Ewa; Mikosch, Thomas; Rosinski, Jan; Samorodnitsky, Gennady
Regular variation of distributional tails is known to be preserved by
various linear transformations of some random structures.
An inverse problem for regular
variation aims at understanding whether the regular variation of a
transformed random object is caused by regular variation of components
of the original random structure. In this paper we build up on previous
work and derive results in the multivariate case and in
situations where regular variation is
not restricted to one particular direction or quadrant.
2013-10-02T00:00:00ZStock Optimization in Emergency Resupply Networks under Stuttering Poisson Demand
http://hdl.handle.net/1813/33186
Stock Optimization in Emergency Resupply Networks under Stuttering Poisson Demand
Chen, J; Jackson, P.L.; Muckstadt, J
We consider a network in which field stocking locations (FSLs) manage multiple parts according
to an (S-1,S) policy. Demand processes for the parts are assumed to be independent
stuttering Poisson processes. Regular replenishments to an FSL occur from a regional stocking
location (RSL) that has an unlimited supply of each part type. Demand in excess
of supply at an FSL is routed to an emergency stocking location (ESL), which also employs
an (S-1,S) policy to manage its inventory. Demand in excess of supply at the ESL is backordered.
Lead time from the ESL to each FSL is assumed to be negligible compared to the
RSL-ESL resupply time. In companion papers we have shown how to approximate the joint
probability distributions of units on hand, units in regular resupply, and units in emergency
resupply. In this paper, we focus on the problem of determining the stock levels at the FSLs
and ESL across all part numbers that minimize backorder, and emergency resupply costs
subject to an inventory investment budget constraint. The problem is shown to be a nonconvex integer programming problem, and we explore a collection of heuristics for solving
the optimization problem.
2013-04-01T00:00:00ZCalculation of ruin probabilities for a dense class of heavy tailed distributions
http://hdl.handle.net/1813/31540
Calculation of ruin probabilities for a dense class of heavy tailed distributions
Bladt, Mogens; Nielsen, Bo Friis; Samorodnitsky, Gennady
In this paper we propose a
class of infinite--dimensional phase--type distributions with
finitely many parameters as models for heavy tailed distributions.
The class of finite--dimensional distributions is dense in the class
of distributions on the positive reals and may hence approximate any
such distribution.
We
prove that formulas from renewal theory, and with a particular
attention to ruin probabilities, which are true for common
phase--type distributions also hold true for the
infinite--dimensional case. We provide algorithms for
calculating functionals of interest such as the renewal density and
the ruin probability. It might be of interest to approximate a given
heavy--tailed distribution of some other type by a distribution from
the class of infinite--dimensional phase--type distributions and to
this end we provide a calibration procedure which works for the
approximation of distributions with a slowly varying tail. An
example from risk theory, comparing ruin probabilities for a
classical risk process with Pareto distributed claim sizes, is
presented and exact known ruin probabilities for the Pareto case are
compared to the ones obtained by approximating by an
infinite--dimensional hyper--exponential distribution.
2013-03-05T00:00:00ZMultivariate tail measure and the estimation of CoVar
http://hdl.handle.net/1813/30442
Multivariate tail measure and the estimation of CoVar
Nguyen, Tilo; Samorodnitsky, Gennady
The quality of estimation of multivariate tails depends significantly
on the portion of the sample included in the estimation. A simple
approach involving sequential statistical testing is proposed in order
to select which observations should be used for estimation of the tail
and spectral measures. We prove that the estimator is consistent. We
test the proposed method on simulated data, and subsequently apply it
to analyze CoVar for stock and index returns.
2012-10-09T00:00:00ZFunctional Central Limit Theorem for Heavy Tailed Stationary Infinitely Divisible Processes Generated by Conservative Flows
http://hdl.handle.net/1813/29996
Functional Central Limit Theorem for Heavy Tailed Stationary Infinitely Divisible Processes Generated by Conservative Flows
Owada, Takashi; Samorodnitsky, Gennady
We establish a new class of functional central limit theorems for
partial sum of certain symmetric stationary infinitely divisible processes with
regularly varying Levy measures. The limit process is a new class of
symmetric stable self-similar processes with stationary increments,
that coincides on a part of its parameter space with a previously
described process. The normalizing sequence and the limiting process
are determined by the ergodic theoretical properties of the flow
underlying the integral representation of the process. These
properties can be interpreted as determining how long is the memory of
the stationary infinitely divisible process. We also
establish functional convergence, in a strong distributional sense,
for conservative pointwise dual ergodic maps preserving an infinite
measure.
2012-09-18T00:00:00ZIntrinsic location functionals of stationary processes
http://hdl.handle.net/1813/29085
Intrinsic location functionals of stationary processes
Samorodnitsky, Gennady; Shen, Yi
We consider a large family of measurable functionals of the sample
path of a stochastic process over compact intervals (including first
hitting times,
leftmost location of the supremum, etc.) we call intrinsic location
functionals. Despite the large variety of these functionals and their
different nature, we show that for stationary processes
the distribution of any intrinsic location functional over an interval
is absolute continuous in the interior of the interval, and the
density functions always have a version satisfying
the same total variation constraints. Conversely, these total
variation constraints are shown to actually characterize stationarity
of the underlying stochastic process. We also show that
the possible distributions of the intrinsic location functionals over
an interval form a weakly closed convex set and describe its extreme
points, and present applications of this description.
2012-06-21T00:00:00ZOn the existence of paths between points in high level excursion sets of Gaussian random fields
http://hdl.handle.net/1813/28637
On the existence of paths between points in high level excursion sets of Gaussian random fields
Adler, Robert; Moldavskaya, Elina; Samorodnitsky, Gennady
The structure of Gaussian random fields over high levels is a well researched and
well understood area, particularly if the field is smooth. However, the question as to whether or not two or more points
which lie in an
excursion set belong to the same connected component has constantly eluded analysis. We study this problem from the point of view of large deviations, finding the asymptotic
probabilities that two such points are connected by a path laying within the excursion set,
and so belong to the same component. In addition, we obtain a characterization and descriptions of the
most likely paths, given that one exists.
2012-03-27T00:00:00ZFractional moments of solutions to stochastic recurrence equations
http://hdl.handle.net/1813/28598
Fractional moments of solutions to stochastic recurrence equations
Mikosch, Thomas; Samorodnitsky, Gennady; Tafakori, Laleh
In this paper we study the fractional moments of the stationary
solution to a stochastic recursion.
We derive recursive formulas for the fractional moments of the solution. Special attention is given to the case when
the additive term has an Erlang distribution. We provide various approximations to the
moments and show their performance in a small numerical study
2012-03-13T00:00:00ZLatent factor regression models for grouped outcomes
http://hdl.handle.net/1813/28480
Latent factor regression models for grouped outcomes
Woodard, Dawn; Love, Tanzy; Thurston, Sally; Ruppert, David; Sathyanarayana, Sheela; Swan, Shanna
We consider models for the effect of exposure on multiple outcomes, where the outcomes are nested in domains. We show that random effect models for this nested situation fit into a standard factor model framework, which leads us to view the modeling options as a spectrum between parsimonious random effect multiple outcomes models and more general continuous latent factor models. We introduce a set of models along this spectrum that extend an existing random effect model for multiple outcomes nested in domains. We characterize the tradeoffs between parsimony and flexibility in this set of models, applying them to both simulated data and data relating phthalate exposure to infant anthropometry.
2012-01-31T00:00:00ZWeak weak quenched limits for the path-valued processes of hitting times and positions of a transient, one-dimensional random walk in a random environment
http://hdl.handle.net/1813/28221
Weak weak quenched limits for the path-valued processes of hitting times and positions of a transient, one-dimensional random walk in a random environment
Peterson, Jonathon; Samorodnitsky, Gennady
In this article we continue the study of the quenched distributions of
transient, one-dimensional random walks in a random environment. In a
previous article we showed that while the quenched distributions of
the hitting times do not converge to any deterministic distribution,
they do have a weak weak limit in the sense that - viewed as random elements of the space of probability measures - they converge in distribution to a certain random probability measure (we refer to this as a weak weak limit because it is a weak limit in the weak topology).
Here, we improve this result to the path-valued process of hitting
times. As a consequence, we are able to also prove a weak weak quenched
limit theorem for the path of the random walk itself.
2011-12-15T00:00:00ZDistribution of the supremum location of stationary processes
http://hdl.handle.net/1813/24397
Distribution of the supremum location of stationary processes
Samorodnitsky, Gennady; Shen, Yi
The location of the unique supremum of
a stationary process on an interval does not need to be uniformly
distributed over that interval. We describe all possible distributions
of the supremum
location for a broad class of such stationary processes. We show that,
in the strongly mixing case, this distribution does tend to the
uniform in a certain sense as the length of the interval increases to
infinity.
2011-10-07T00:00:00ZIs the location of the supremum of a stationary process nearly uniformly distributed?'
http://hdl.handle.net/1813/24396
Is the location of the supremum of a stationary process nearly uniformly distributed?'
Samorodnitsky, Gennady; Shen, Yi
It is, perhaps, surprising that the location of the unique supremum of
a stationary process on an interval can fail to be uniformly
distributed over that interval. We show that this distribution is
absolutely continuous in the interior of the interval and describe very
specific conditions the density has to satisfy. We establish universal
upper bounds on the density and demonstrate their optimality.
2011-10-07T00:00:00ZMean-variance hedging with oil futures
http://hdl.handle.net/1813/23567
Mean-variance hedging with oil futures
Wang, Liao; Wissel, Johannes
We analyze mean-variance-optimal dynamic hedging strategies in oil producers and consumers. In a model for the oil spot and futures market with Gaussian convenience yield curves and a stochastic market price of risk, we find analytical solutions for the optimal trading strategies. An implementation of our strategies in an out-of-sample test on market data shows that the hedging strategies improve long-term return risk profiles of both the producer and the consumer.
2011-08-29T00:00:00ZMulti-product Separation Result for Inventory Management Under Inflation Risk
http://hdl.handle.net/1813/23357
Multi-product Separation Result for Inventory Management Under Inflation Risk
Sun, Yuemeng; Wissel, Johannes; Jackson, Peter
2011-07-25T00:00:00ZTravel time estimation for ambulances using Bayesian data augmentation
http://hdl.handle.net/1813/23135
Travel time estimation for ambulances using Bayesian data augmentation
Westgate, B. S.; Woodard, D. B.; Matteson, D. S.; Henderson, S. G.
Estimates of ambulance travel times on road networks are critical for effective ambulance
base placement and real-time ambulance dispatching. We introduce new methods for estimating the distribution of travel times on each road segment in a city, using Global Positioning System (GPS) data recorded during ambulance trips. Our preferred method uses a Bayesian model of the ambulance trips and GPS data. Due to sparseness and error in the GPS data, the exact ambulance paths and travel times on each road segment are unknown. To estimate the travel time distributions using the GPS data, we must also estimate each ambulance path. This is called the map-matching problem. We consider the unknown paths and travel times to be missing data, and simultaneously estimate them and the parameters of each road segment
travel time distribution using Bayesian data augmentation. We also introduce two alternative estimation methods using GPS speed data that are simple to implement in practice.
We test the predictive accuracy of the three methods on a subregion of Toronto, using
simulated data and data from Toronto EMS. All three methods perform well. Point estimates of ambulance trip durations from the Bayesian method outperform estimates from the alternative methods by roughly 5% in root mean squared error. Interval estimates from the Bayesian method for the Toronto EMS data are substantially better than interval estimates from the alternative methods. Map-matching estimates from the Bayesian method are robust to large GPS location errors, and interpolate well between widely spaced GPS points.
2011-07-08T00:00:00ZA Robust Robust Optimization Result
http://hdl.handle.net/1813/22809
A Robust Robust Optimization Result
Gancarova, Martina; Todd, Michael
2011-04-29T00:00:00ZTail Inference: where does the tail begin?
http://hdl.handle.net/1813/22505
Tail Inference: where does the tail begin?
Nguyen, Tilo; Samorodnitsky, Gennady
The quality of estimation of tail parameters, such as tail index in
the univariate case, or the spectral measure in the multivariate case,
depends crucially on the part of the sample included in the estimation.
A simple approach involving sequential statistical testing is proposed
in order to choose this part of the sample. This method can be used
both in the univariate and multivariate cases. It is computationally
efficient, and can be easily automated. No visual inspection of the
data is required. We establish consistency of the Hill estimator when
used in conjunction with the proposed method, as well describe its
asymptotic fluctuations. We compare our method to existing methods in
univariate and multivariate tail estimation, and use it to analyze
Danish fire insurance data.
2011-03-31T00:00:00ZStationarity of Generalized Autoregressive Moving Average Models
http://hdl.handle.net/1813/22495
Stationarity of Generalized Autoregressive Moving Average Models
Woodard, Dawn; Matteson, David; Henderson, Shane
Time series models are often constructed by combining nonstationary effects such as trends with stochastic processes that are believed to be stationary. Although stationarity of the underlying process is typically crucial to ensure desirable properties or even validity of statistical estimators, there are numerous time series models for which this stationarity is not yet proven. A major barrier is that the most commonly-used methods assume phi-irreducibility, a condition that can be violated for the important class of discrete-valued observation-driven models. We show (strict) stationarity for the class of Generalized Autoregressive Moving Average
(GARMA) models, which provides a flexible analogue of ARMA models for count, binary, or
other discrete-valued data. We do this from two perspectives. First, we show stationarity
and ergodicity of a perturbed version of the GARMA model, and show that the perturbed model yields parameter estimates that are arbitrarily close to those of the original model. This approach utilizes the fact that the perturbed model is phi-irreducible. Second, we show that the original GARMA model has a unique stationary distribution (so is strictly stationary when initialized in that distribution).
2010-10-28T00:00:00ZHierarchical adaptive regression kernels for regression with functional predictors
http://hdl.handle.net/1813/21991
Hierarchical adaptive regression kernels for regression with functional predictors
Woodard, Dawn B.; Crainiceanu, Ciprian; Ruppert, David
We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for
data that exhibit features such as spikes, dips, and plateaus whose frequency, location,
size, and shape varies across subjects. We propose full Bayesian inference of the joint
functional and exposure models, and give a method for efficient computation.
We contrast our approach with existing state-of-the-art methods for regression with
functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, in order to better
understand the relationship between sleep characteristics and health outcomes.
2011-01-11T00:00:00ZLower bounds on the convergence rates of adaptive MCMC methods
http://hdl.handle.net/1813/21990
Lower bounds on the convergence rates of adaptive MCMC methods
Schmidler, Scott; Woodard, Dawn B.
We consider the convergence properties of recently proposed adaptive Markov chain Monte Carlo (MCMC) algorithms for approximation of high-dimensional integrals arising in Bayesian analysis and statistical mechanics. Despite their name, in the general case these algorithms produce non-Markovian, time-inhomogeneous, irreversible stochastic processes. Nevertheless, we show that lower bounds on
the mixing times of these processes can be obtained using familiar ideas of hitting times and conductance from the theory of reversible Markov chains. While loose in some cases, the bounds obtained are sufficient to demonstrate slow mixing of several recently proposed algorithms including the adaptive Metropolis algorithm of Haario et al. (2001), the equi-energy sampler (Kou et al., 2006), and the importance-resampling MCMC algorithm (Atchad´e, 2009b) on some multimodal target distributions including mixtures of normal distributions and the mean-field Potts model. These results appear to be the first non-trivial bounds on the mixing times of adaptive MCMC samplers, and suggest that the adaptive methods considered may not provide qualitative improvements in mixing over the simpler Markov chain algorithms on which they are based. Our bounds also indicate properties which adaptive MCMC algorithms must have to achieve exponential speed-ups, suggesting directions for further research in
these methods.
2011-01-11T00:00:00ZConvergence rate of Markov chain methods for genomic motif discovery
http://hdl.handle.net/1813/21989
Convergence rate of Markov chain methods for genomic motif discovery
Woodard, Dawn B.
We analyze the convergence rate of a popular Gibbs sampling method used for statistical discovery of gene regulatory binding motifs in DNA sequences. This sampler satisfies a very strong form of ergodicity (uniform). However, we show that, due to multimodality of the posterior distribution, the rate of convergence often decreases exponentially as a function of the length of the DNA sequence. Specifically, we show that this occurs whenever there is more than one true repeating pattern in the data. In practice there are typically multiple, even numerous, such patterns in biological data, the goal being to detect the most well-conserved and frequently-occurring of these. Our findings match empirical results, in which the motif-discovery Gibbs sampler has exhibited such poor convergence that it is used only for finding modes of the posterior distribution (candidate motifs) rather than for obtaining samples from that distribution. Ours appear to be the first meaningful bounds on the convergence rate of a Markov chain method for sampling from a multimodal posterior distribution, as a function of statistical quantities like the number of observations.
2011-01-11T00:00:00ZWeak quenched limiting distributions for transient one-dimensional random walk in a random environment
http://hdl.handle.net/1813/19321
Weak quenched limiting distributions for transient one-dimensional random walk in a random environment
Peterson, Jonathon; Samorodnitsky, Gennady
We consider a one-dimensional, transient random walk in a random
i.i.d. environment. The asymptotic behaviour of such
random walk depends to a large extent on a crucial parameter
kappa>0 that determines the fluctuations of the process.
When 0<kappa<2, the averaged distributions of the hitting times of the random walk converge to a kappa-stable distribution. However, it was shown recently that in this case there does not exist a quenched limiting distribution of the hitting times.
That is, it is not true that
for almost every fixed environment, the
distributions of the hitting times (centered and scaled in any manner)
converge to a non-degenerate distribution. We
show, however, that the quenched distributions do have a limit in the
weak sense. That is, the quenched distributions of the hitting times
%of the random walk
-- viewed as a random probability measure on R -- converge in
distribution to a random probability measure, which has interesting
stability properties. Our results generalize both the averaged limiting distribution and the non-existence of quenched limiting distributions.
2010-12-07T00:00:00ZStationarity of Count-Valued and Nonlinear Time Series Models
http://hdl.handle.net/1813/17801
Stationarity of Count-Valued and Nonlinear Time Series Models
Woodard, Dawn; Matteson, David; Henderson, Shane
Time series models are often constructed by combining nonstationary effects such as trends with stochastic processes that are believed to be stationary. Although stationarity of the underlying process is typically crucial to ensure desirable properties or even validity of statistical
estimators, there are numerous time series models for which this stationarity is not yet proven. One of the most general methods for proving stationarity is via the use of drift conditions; however, this method assumes phi-irreducibility, which is violated by the important class of count-valued observation-driven models. We provide a formal justification for the use of drift
conditions on count-valued observation-driven models, and demonstrate by proving for the first time stationarity and ergodicity of several models. These include the class of Generalized Autoregressive Moving Average models, which contains a number of important count-valued and nonlinear models as special cases.
2010-10-28T00:00:00ZLarge deviations for Minkowski sums of heavy-tailed generally non-convex random compact sets
http://hdl.handle.net/1813/17317
Large deviations for Minkowski sums of heavy-tailed generally non-convex random compact sets
Mikosch, Thomas; Pawlas, Zbynek; Samorodnitsky, Gennady
We prove large deviation results for Minkowski sums of iid random
compact sets where we assume
that the summands have a regularly varying distribution. The
result confirms the heavy-tailed large deviation heuristics:
``large'' values of the sum are essentially due to the ``largest''
summand.
2010-08-16T14:36:00ZA large deviation principle for Minkowski sums of heavy-tailed random compact convex sets with finite expectation
http://hdl.handle.net/1813/17316
A large deviation principle for Minkowski sums of heavy-tailed random compact convex sets with finite expectation
Mikosch, Thomas; Pawlas, Zbynek; Samorodnitsky, Gennady
We prove large deviation results for Minkowski sums of iid random
compact sets where we assume
that the summands have a regularly varying distribution and
finite expectation. The main focus is on random convex compact sets. The
results confirm the heavy-tailed large deviation heuristics:
``large'' values of the sum are essentially due to the ``largest''
summand. These results extend those in Mikosch et al. (2011) for generally
non-convex sets, where we assumed that the normalization of the sum
grows faster than the number of terms.
2010-08-16T13:27:58ZOn the implementation and usage of SDPT3 - a MATLAB software package for semidefinite-quadratic-linear programming, version 4.0
http://hdl.handle.net/1813/15133
On the implementation and usage of SDPT3 - a MATLAB software package for semidefinite-quadratic-linear programming, version 4.0
Toh, Kim-Chuan; Todd, Michael J.; Tutuncu, Reha H.
This software is designed to solve primal and dual semide¯nite-quadratic-linear conic programming problems (known as SQLP problems) whose constraint cone is a product of semide¯nite cones, second-order cones, nonnegative orthants and Euclidean spaces, and whose objective function is the sum of linear functions and log-barrier terms associated with the constraint cones. This includes the special case of determinant maximization problems with linear matrix inequalities. It employs an infeasible primal-dual predictor-corrector path-following method, with either the
HKM or the NT search direction. The basic code is written in Matlab, but key
subroutines in C are incorporated via Mex ¯les. Routines are provided to read in
problems in either SDPA or SeDuMi format. Sparsity and block diagonal structure are exploited. We also exploit low-rank structures in the constraint matrices associ-
ated with the semide¯nite blocks if such structures are explicitly given. To help the
users in using our software, we also include some examples to illustrate the coding of
problem data for our solver. Various techniques to improve the eficiency and robustness of the main solver are incorporated. For example, step-lengths associated with semide¯nite cones are calculated via the Lanczos method. The current version also implements algorithms for solving a 3-parameter homogeneous self-dual model of the primal and dual SQLP problems. Routines are also provided to determine whether the primal and dual feasible regions of a given SQLP have empty interiors. Numerical experiments show that this general-purpose code can solve more than 80% of a total of about 430 test problems to an accuracy of at least 10¡6 in relative duality gap and infeasibilities.
2010-06-21T21:15:42ZEstablishing Stationarity of Time Series Models via Drift Criteria
http://hdl.handle.net/1813/14988
Establishing Stationarity of Time Series Models via Drift Criteria
Woodard, Dawn; Matteson, David; Henderson, Shane
Time series models are often constructed by combining nonstationary effects such as trends with stochastic processes that are known (or believed) to be stationary. However, there are numerous time series models for which the stationarity of the underlying process is conjectured but not yet proven. We give an approachable introduction to the use of drift criteria (also known as Lyapunov function techniques) for establishing strict stationarity and ergodicity of such models. These conditions immediately imply consistent estimation of the mean and lagged covariances, and more generally the expectation of any integrable function. We demonstrate by proving stationarity and ergodicity for several novel and useful examples, including Poisson log-link Generalized Autoregressive Moving Average models.
ORIE Technical Report 1477
2010-05-06T20:05:29ZFunction-indexed empirical processes based on an infinite source Poisson transmission stream
http://hdl.handle.net/1813/14725
Function-indexed empirical processes based on an infinite source Poisson transmission stream
Roueff, Francois; Samorodnitsky, Gennady; Soulier, Philippe
We study the asymptotic behavior of empirical processes generated by
measurable bounded functions of an infinite source Poisson transmission
process when the session length have infinite variance. In spite of the
boundedness of the function, the normalized fluctuations of such an empirical
process converge to a non-Gaussian stable process. This phenomenon can be
viewed as caused by the long-range dependence in the transmission process.
Completing previous results on the empirical mean of similar types of
processes, our results on non-linear bounded functions exhibit the influence
of the limit transmission rate distribution at high session lengths on the
asymptotic behavior of the empirical process. As an illustration, we apply
the main result to estimation of the distribution function of the steady
state value of the transmission process.
2010-04-08T13:18:37ZLong strange segments, ruin probabilities and the effect of memory on moving average processes
http://hdl.handle.net/1813/14476
Long strange segments, ruin probabilities and the effect of memory on moving average processes
Ghosh, Souvik
We obtain the rate of growth of long strange segments and the rate of
decay of
infinite horizon ruin probabilities for a class of infinite moving
average processes with exponentially light tails. The rates are
computed explicitly. We show that the rates are very similar to those
of an i.i.d. process as long as moving average coefficients decay fast
enough. If they
do not, then the rates are significantly different. This demonstrates
the change in the length of memory in a moving average process
associated with certain changes in the rate of decay of the
coefficients.
2010-02-15T17:12:36ZHow Fast Can the Chord-Length Distribution Decay?
http://hdl.handle.net/1813/13700
How Fast Can the Chord-Length Distribution Decay?
Demichel, Yann; Estrade, Anne; Kratz, Marie; Samorodnitsky, Gennady
The modelling of random bi-phasic, or porous, media has been, and still is,
under active investigation by mathematicians, physicists or physicians. In this
paper we consider a thresholded random process X as a source of the two
phases. The intervals when X is in a given phase, named chords, are the
subject of interest. We focus on the study of the tails of the chord-length
distribution functions. In the literature, different types of the tail behavior
have been reported, among them exponential-like or power-like decay. We look
for the link between the dependence structure of the underlying thresholded
process X and the rate of decay of the chord-length distribution. When the
process X is a stationary Gaussian process, we relate the latter to the rate
at which the covariance function of $X$ decays at large lags. We show that
exponential, or nearly exponential, decay of the tail of the distribution of
the chord-lengths is very common, perhaps surprisingly so.
2009-09-21T20:18:12ZEvaluating Planned Capacities for Public Health Emergency Supply Chain Models
http://hdl.handle.net/1813/13686
Evaluating Planned Capacities for Public Health Emergency Supply Chain Models
King, Kathleen; Muckstadt, Jack
To respond effectively in the event of a public health emergency, careful analysis of preparedness plans is essential. This note describes a linear programming model of the Center for Disease Control Strategic National Stockpile (SNS) supply chain for distributing medical supplies under
emergency conditions. The purpose of the model is to evaluate the effects that a specified collection of planned resource availabilities would have on the efficacy of the distribution network. This efficacy is measured in terms of the number of patient -hours waited during the emergency response
time horizon. The model's constraints describe the physical system, including the relationships between the SNS, the regional distribution warehouse, and the medical supply distribution clinics. In the following note all model constraints and variables are described in detail, and AMPL code implementing the model is given.
2009-09-16T17:50:17ZHigh level excursion set geometry for non-Gaussian infinitely divisible random fields
http://hdl.handle.net/1813/13345
High level excursion set geometry for non-Gaussian infinitely divisible random fields
Adler, Robert; Samorodnitsky, Gennady; Taylor, Jonathan
We consider smooth, infinitely divisible random fields with regularly varying
Levy measure, and are interested in the
geometric characteristics of the excursion sets over high levels u.
For a large class of such random fields we compute the asymptotic
joint distribution of the numbers of critical points, of various types,
of the random field in the excursion set, conditional on the latter being non-empty.
This allows us, for example, to obtain the
asymptotic conditional distribution of the Euler characteristic of the
excursion set.
In a significant departure from the Gaussian situation,
the high level excursion sets for these random fields can have quite a
complicated geometry. Whereas in the Gaussian
case non-empty excursion sets are, with high probability, roughly
ellipsoidal, in the more general infinitely divisible setting almost
any shape is possible.
2009-08-04T19:45:03ZPrediction of outstanding payments in a Poisson cluster model
http://hdl.handle.net/1813/13344
Prediction of outstanding payments in a Poisson cluster model
Jessen, Anders Hedegaard; Mikosch, Thomas; Samorodnitsky, Gennady
We consider a simple Poisson cluster model for the payment numbers and the
corresponding total payments for insurance claims arriving in a given
year. Due to the Poisson structure one can
give reasonably explicit expressions for the prediction of the payment
numbers and total payments in future periods given the
past observations of the payment numbers. One can also derive
reasonably explicit expressions for the corresponding
prediction errors. In the (a,b)-class of Panjer's claim size distributions,
these expressions can be evaluated by simple
recursive algorithms. We study the conditions under which the
predictions are asymptotically linear as the number
of past payments becomes large. We also demonstrate that, in other
regimes, the prediction may be far from linear. For example, a
staircase-like pattern may arise as well.
We illustrate how the theory works
on real-life data, also in comparison with the chain ladder method.
2009-08-04T19:08:37Z