ORIE Technical Reports
This is a collection of Technical Reports and Papers written for the field of Operations Research and Information Engineering.
Recent Submissions
Item: Handling missing extremes in tail estimation
Xu, Hui; Davis, Richard; Samorodnitsky, Gennady (2020)
In some data sets, a portion of the extreme observations may be missing. This can happen when the extreme observations are simply not available or are imprecisely measured. For example, in the study of human lifetimes, a topic of recent interest, birth certificates of centenarians may not even exist, and many such individuals may not be included in the data sets that are currently available. In essence, one does not have a clear record of the largest lifetimes of human populations. If extreme observations are missing, then risk can be severely underestimated, so that rare events occur more often than originally thought. In concrete terms, this may mean that a 500 year flood is in fact a 100 (or even a 20) year flood. In this paper, we present methods for estimating the number of missing extremes together with the tail index associated with tail heaviness of the data. Ignoring one or the other can severely impact the estimation of risk. Our estimates are based on the HEWE (Hill estimate without extremes) of the tail index, which adjusts for missing extremes. Based on a functional convergence of this process to a limit process, we consider an asymptotic likelihood-based procedure for estimating both the number of missing extremes and the tail index. We derive the asymptotic distribution of the resulting estimates.
By artificially removing segments of extremes in the data, this methodology can be used for assessing the reliability of the underlying assumptions that are imposed on the data.

Item: Extremal clustering under moderate long range dependence and moderately heavy tails
Chen, Zaoli; Samorodnitsky, Gennady (2020)
We study clustering of the extremes in a stationary sequence with subexponential tails in the maximum domain of attraction of the Gumbel distribution. We obtain functional limit theorems in the space of random sup-measures and in the space D(0, ∞). The limits have the Gumbel distribution if the memory is only moderately long. However, as our results demonstrate rather strikingly, the “heuristic of a single big jump” could fail even in a moderately long range dependence setting. As the tails become lighter, the extremal behavior of a stationary process may depend on multiple large values of the driving noise.

Item: High minima of non-smooth Gaussian processes
Wu, Zhixin; Chakrabarty, Arijit; Samorodnitsky, Gennady (2019-02-27)
In this short note we study the asymptotic behaviour of the minima over compact intervals of Gaussian processes whose paths are not necessarily smooth. We show that, beyond the logarithmic large deviation Gaussian estimates, this problem is closely related to the classical small-ball problem. Under certain conditions we estimate the term describing the correction to the large deviation behaviour. In addition, the asymptotic distribution of the location of the minimum, conditionally on the minimum exceeding a high threshold, is also studied.

Item: Extreme Value Theory for Long Range Dependent Stable Random Fields
Chen, Zaoli; Samorodnitsky, Gennady (2018-10-15)
We study the extremes for a class of symmetric stable random fields with long range dependence. We prove functional extremal theorems both in the space of sup measures and in the space of cadlag functions of several variables.
The limits in both types of theorems are of a new kind, and only in a certain range of parameters do these limits have the Fréchet distribution.

Item: Modelling and Inference for Extremal Events
Sun, Julian (2018-08-11)
Extreme events are frequently observed in nature and in human activities; they tend to have severe and often negative impact. For this reason they are well studied, and the underlying body of work is usually referred to as extreme value theory. The theory often deals with the behavior in the tail of probability distributions or data sets. A key notion is that of heavy-tailed probability distributions. Univariate heavy-tailed distributions exhibit interesting mathematical properties that are practical for modelling purposes. However, many types of univariate heavy-tailed distributions do not have natural multivariate extensions. Another area of interest in extreme value theory is that of the clustering of extremes in stationary sequences. Inference of cluster sizes tends to be difficult, partly due to the scarcity of data. Clustering also introduces heavy serial dependence in extremal observations, which in turn influences statistical analysis. This thesis seeks to address the aforementioned problems and difficulties. Key contributions include: a multivariate model for a particular class of heavy-tailed distributions, the subexponential distributions, that allows for the approximation of ruin probabilities; a multilevel approach to extremal inference that partially addresses the issue of data scarcity and that improves the variance of extremal estimators; and an algorithmic method to reduce serial dependence in extremal inference.

Item: Regularly Varying Random Fields
Wu, Lifan; Samorodnitsky, Gennady (2018-09-05)
We study the extremes of multivariate regularly varying random fields. The crucial tools in our study are the tail field and the spectral field, notions that extend the tail and spectral processes of Basrak and Segers (2009).
The spatial context requires multiple notions of extremal index, and the tail and spectral fields are applied to clarify these notions and other aspects of extremal clusters. An important application of the techniques we develop is to the Brown-Resnick random fields.

Item: Distance covariance for discretized stochastic processes
Dehling, Harold; Matsui, Muneya; Mikosch, Thomas; Samorodnitsky, Gennady; Tafakori, Laleh (2018-06-26)
Given an iid sequence of pairs of stochastic processes on the unit interval, we construct a measure of independence for the components of the pairs. We define distance covariance and distance correlation based on approximations of the component processes at finitely many discretization points. Assuming that the mesh of the discretization converges to zero as a suitable function of the sample size, we show that the sample distance covariance and correlation converge to limits which are zero if and only if the component processes are independent. To construct a test for independence of the discretized component processes, we show consistency of the bootstrap for the corresponding sample distance covariance/correlation.

Item: From infinite urn schemes to self-similar stable processes
Durieu, Olivier; Samorodnitsky, Gennady; Wang, Yizao (2017)
We investigate the randomized Karlin model with parameter beta in (0,1), which is based on an infinite urn scheme. It has been shown before that when the randomization is bounded, the so-called odd-occupancy process scales to a fractional Brownian motion with Hurst index beta/2 in (0,1/2).
We show here that when the randomization is heavy-tailed with index alpha in (0,2), the odd-occupancy process scales to a (beta/alpha)-self-similar symmetric alpha-stable process with stationary increments.

Item: Extreme value analysis without the largest values: what can be done?
Zou, Jingjing; Davis, Richard; Samorodnitsky, Gennady (2017)
In this paper we are concerned with the analysis of heavy-tailed data when a portion of the extreme values are unavailable. This research was motivated by an analysis of the degree distributions in a large social network. The degree distributions of such networks tend to have power law behavior in the tails. We focus on the Hill estimator, which plays a starring role in heavy-tailed modeling. The Hill estimator for this data exhibited a smooth and increasing “sample path” as a function of the number of upper order statistics used in constructing the estimator. This behavior became more apparent as we artificially removed more of the upper order statistics. Building on this observation, we introduce a new parameterization into the Hill estimator that corresponds to the proportion of extreme values that are unavailable and the proportion of upper order statistics used in the estimation. We establish functional convergence of the normalized Hill estimator to a Gaussian random field. An estimation procedure is developed based on the limit theory to estimate the number of missing extremes and extreme value parameters including the tail index and the bias of Hill’s estimate. We illustrate how this approach works in both simulations and real data examples.

Item: Extremal theory for long range dependent infinitely divisible processes
Samorodnitsky, Gennady; Wang, Yizao (2017-03)
We prove limit theorems of an entirely new type for certain long memory regularly varying stationary infinitely divisible random processes. These theorems involve multiple phase transitions governed by how long the memory is.
Apart from one regime, our results exhibit limits that are not among the classical extreme value distributions. Restricted to the one-dimensional case, the distributions we obtain interpolate, in the appropriate parameter range, the $\alpha$-Fréchet distribution and the skewed $\alpha$-stable distribution. In general, the limit is a new family of stationary and self-similar random sup-measures with parameters $\alpha\in(0,\infty)$ and $\beta\in(0,1)$, with representations based on intersections of independent $\beta$-stable regenerative sets. The tail of the limit random sup-measure on each interval with finite positive length is regularly varying with index $-\alpha$. The intriguing structure of these random sup-measures is due to intersections of independent $\beta$-stable regenerative sets and the fact that the number of such sets intersecting simultaneously increases to infinity as $\beta$ increases to one. The results in this paper substantially extend previous investigations where only $\alpha\in(0,2)$ and $\beta\in(0,1/2)$ have been considered.