Handling missing extremes in tail estimation
Xu, Hui; Davis, Richard; Samorodnitsky, Gennady
In some data sets, a portion of the extreme observations may be missing. This can happen when the extreme observations are simply not available or are imprecisely measured. For example, in the study of human lifetimes, a topic of recent interest, birth certificates of centenarians may not exist, and many such individuals may not be included in the data sets that are currently available. In essence, one does not have a clear record of the largest lifetimes of human populations. If extreme observations are missing, risk can be severely underestimated, so that rare events occur more often than originally thought. In concrete terms, this may mean that a 500-year flood is in fact a 100-year (or even a 20-year) flood. In this paper, we present methods for estimating the number of missing extremes together with the tail index, which measures the tail heaviness of the data. Ignoring one or the other can severely impact the estimation of risk. Our estimates are based on the HEWE (Hill estimate without extremes) of the tail index, which adjusts for missing extremes. Based on the functional convergence of the HEWE process to a limit process, we consider an asymptotic likelihood-based procedure for estimating both the number of missing extremes and the tail index. We derive the asymptotic distribution of the resulting estimates. By artificially removing segments of extremes in the data, this methodology can also be used to assess the reliability of the underlying assumptions that are imposed on the data.
ARO grant W911NF-18-10318
heavy tails; regular variation; missing extremes; tail estimation
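
The effect described in the abstract, that ignoring missing extremes distorts tail estimation, can be illustrated with a small simulation. The sketch below is not the HEWE or the likelihood-based procedure of the paper; it only applies the classical Hill estimator to a simulated Pareto sample before and after the largest observations are deleted. The function names (`hill_estimate`, `drop_largest`), the sample size, the tail index, and the numbers of order statistics used and removed are all assumptions chosen for illustration.

```python
import numpy as np


def hill_estimate(sample, k):
    """Classical Hill estimate of gamma = 1/alpha from the k largest values.

    Illustrative helper only; this is NOT the HEWE defined in the paper.
    """
    x = np.sort(np.asarray(sample, dtype=float))[::-1]  # descending order
    if not 1 <= k < x.size:
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    # Mean log-excess of the top k values over the (k+1)-th largest value.
    return float(np.mean(np.log(x[:k])) - np.log(x[k]))


def drop_largest(sample, m):
    """Return the sample with its m largest observations removed."""
    x = np.sort(np.asarray(sample, dtype=float))[::-1]
    return x[m:]


# Assumed simulation settings (hypothetical, for illustration only).
rng = np.random.default_rng(0)
alpha = 2.0          # true tail index, so gamma = 1/alpha = 0.5
n = 20_000           # sample size
k = 500              # number of upper order statistics used by Hill
m = 100              # number of top observations treated as missing

# Generator.pareto draws from a Lomax law; adding 1 gives a classical
# Pareto distribution with scale 1 and tail index alpha.
sample = rng.pareto(alpha, size=n) + 1.0

gamma_full = hill_estimate(sample, k)
gamma_truncated = hill_estimate(drop_largest(sample, m), k)

print(f"Hill estimate, full sample:        {gamma_full:.3f}")
print(f"Hill estimate, top {m} removed:    {gamma_truncated:.3f}")
```

In this setup the estimate computed after deleting the largest values typically comes out lower than the one from the full sample, which makes the tail look lighter than it is. That is the kind of bias an estimator adjusted for missing extremes is meant to correct.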