Extreme value analysis without the largest values: what can be done?
Zou, Jingjing; Davis, Richard; Samorodnitsky, Gennady
In this paper we are concerned with the analysis of heavy-tailed data when a portion of the extreme values is unavailable. This research was motivated by an analysis of the degree distributions in a large social network. The degree distributions of such networks tend to have power law behavior in the tails. We focus on the Hill estimator, which plays a starring role in heavy-tailed modeling. The Hill estimator for these data exhibited a smooth and increasing “sample path” as a function of the number of upper order statistics used in constructing the estimator. This behavior became more apparent as we artificially removed more of the upper order statistics. Building on this observation, we introduce a new parameterization into the Hill estimator that corresponds to the proportion of extreme values that are unavailable and the proportion of upper order statistics used in the estimation. We establish functional convergence of the normalized Hill estimator to a Gaussian random field. An estimation procedure is developed based on the limit theory to estimate the number of missing extremes and extreme value parameters, including the tail index and the bias of Hill’s estimate. We illustrate how this approach works in both simulations and real data examples.
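The qualitative behavior described above can be reproduced with a minimal Python sketch. It uses the standard Hill estimator, H(k) = (1/k) Σ_{i=1..k} log X_(i) − log X_(k+1), where X_(1) ≥ X_(2) ≥ ... are the descending order statistics. Everything else is an assumption made for illustration and is not taken from the paper: the function name `hill_estimates`, the simulated Pareto sample with tail index 2, and the choice to discard the 100 largest observations. It is not the estimation procedure developed in the paper.

```python
import numpy as np


def hill_estimates(sample, k_max):
    """Return Hill estimates of gamma = 1/alpha for k = 1, ..., k_max.

    Entry k-1 of the result uses the k largest observations, with the
    (k+1)-th largest observation acting as the threshold.
    """
    x = np.sort(np.asarray(sample, dtype=float))[::-1]  # descending order
    if k_max >= x.size:
        raise ValueError("k_max must be smaller than the sample size")
    logs = np.log(x[: k_max + 1])
    k = np.arange(1, k_max + 1)
    # Running mean of the top-k log values minus the log threshold.
    return np.cumsum(logs[:k_max]) / k - logs[1 : k_max + 1]


# Illustrative data (an assumption): classical Pareto with alpha = 2,
# so the true value of gamma = 1/alpha is 0.5.
rng = np.random.default_rng(0)
data = rng.pareto(2.0, size=5000) + 1.0

# Hill "sample path" on the full sample.
full = hill_estimates(data, k_max=500)

# Artificially remove the 100 largest observations, then recompute.
n_removed = 100
observed = np.sort(data)[:-n_removed]
truncated = hill_estimates(observed, k_max=500)

for k in (10, 100, 500):
    print(f"k={k:4d}  full={full[k - 1]:.3f}  truncated={truncated[k - 1]:.3f}")
```

In this sketch the estimates from the truncated sample start well below the full-sample estimates for small k and rise as k grows, which is the smooth, increasing pattern the abstract refers to.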
This research is funded by ARO MURI grant W911NF-12-1-0385.
Hill estimator; Heavy tails; Missing extremes; Functional convergence