Block Maxima Approach
One approach to working with extreme value data is to group the data into blocks of equal length
and fit a distribution to the maximum of each block, for example, annual maxima of daily precipitation
amounts. The choice of block size can be critical: blocks that are too small can lead to bias, and blocks
that are too large generate too few block maxima, which leads to large estimation variance (see
Coles (2001) (b) Ch. 3). The block maxima approach is closely associated with the use of the GEV family.
Note that extRemes always estimates all parameters by maximum likelihood estimation (MLE), which requires
iterative numerical optimization techniques. See Coles (2001) (b) section 2.6 on parametric modeling for
more information on this optimization method.
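As an illustration of the blocking step only (this is not part of the toolkit), the following minimal Python sketch reduces a series to its block maxima. The function name block_maxima and the sample values are hypothetical, and the sketch assumes equally spaced observations with any incomplete trailing block dropped.

```python
from typing import List, Sequence


def block_maxima(values: Sequence[float], block_size: int) -> List[float]:
    """Return the maximum of each complete block of `block_size` values."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_blocks = len(values) // block_size  # an incomplete trailing block is dropped
    return [
        max(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


# Hypothetical series: three blocks ("years") of four observations each.
daily = [1.2, 3.4, 0.5, 2.2, 4.1, 0.3, 2.8, 3.9, 0.9, 5.6, 1.1, 2.0]
annual_max = block_maxima(daily, block_size=4)
print(annual_max)
```

Larger values of block_size leave fewer maxima to fit, which is the variance side of the trade-off described above.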
Fitting data to a GEV distribution
The general procedure for fitting data to a GEV distribution with extRemes is
- Analyze -> Generalized Extreme Value (GEV) Distribution. A new window appears.
- Select a data object from the Data Object listbox. Column names appear in the other listboxes.
- Choose a response variable from the Response listbox. The response variable is removed as an option
from the other listboxes.
- Select other options as desired, then click OK.
- A GEV distribution will be fitted to the chosen response variable and stored in the same list object
as the data used.
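The fit in the last step is obtained by minimizing the GEV negative log-likelihood over the location, scale and shape parameters. As a rough sketch of that objective (not the toolkit's own code), the function below evaluates it in plain Python. The name gev_nllh, the sample maxima, and the parameter values are hypothetical, and the numerical optimizer itself is left out.

```python
import math
from typing import Sequence


def gev_nllh(data: Sequence[float], mu: float, sigma: float, xi: float) -> float:
    """GEV negative log-likelihood; returns inf outside the parameter space."""
    if sigma <= 0:
        return math.inf
    total = 0.0
    for x in data:
        z = (x - mu) / sigma
        if abs(xi) < 1e-8:
            # Gumbel limit of the GEV as xi -> 0.
            total += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0:
                return math.inf  # observation lies outside the GEV support
            total += (math.log(sigma)
                      + (1.0 + 1.0 / xi) * math.log(t)
                      + t ** (-1.0 / xi))
    return total


# Hypothetical block maxima and two candidate parameter sets.
maxima = [14.2, 16.8, 13.1, 18.4, 15.0, 17.3]
print(gev_nllh(maxima, mu=15.0, sigma=2.0, xi=-0.1))
print(gev_nllh(maxima, mu=10.0, sigma=2.0, xi=-0.1))  # a poorer fit gives a larger value
```

An optimizer would search for the parameter values that make this quantity as small as possible; the minimized value is what the toolkit reports as the negative log-likelihood of the fit.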
Example 1: Port Jervis data
This example uses the PORT dataset (see
Example 2: Loading an R source Dataset) to illustrate fitting data to a GEV
using extRemes. If you have not already loaded these data,
please do so before trying this example. Figure 2.1 shows a time series of the annual (winter)
maximum temperatures (degrees centigrade).

Figure 2.1: Time series of Port Jervis annual (winter) maximum temperature (degrees centigrade).
- Analyze -> Generalized Extreme Value (GEV) Distribution. A new window appears.
- Select PORT from the Data Object listbox. Column names appear in the other listboxes.
- Choose TMX1 from the Response listbox (note that TMX1 is removed as an option from the other listboxes).
- Click on the Plot diagnostics checkbutton, then click OK.
- Here, we ignore the rest of the fields because we are not yet incorporating any covariates into the fit.
An R graphics window appears displaying the probability and quantile plots, a return-level plot, and a density estimate plot
as shown in Figure 2.2. In the case of perfect fit, the data
would line up on the diagonal of the probability and quantile plots.
Briefly, the quantile plot compares the model quantiles against the data (empirical) quantiles. A quantile plot that
deviates greatly from a straight line suggests that the model assumptions may be invalid for the data plotted. The return
level plot shows the return level against the return period, along with an estimated 95% confidence interval. The return
level is the level (in this case temperature) that is expected to be exceeded, on average, once every m time points
(in this case years). The return period is the expected waiting time until a particular return level is exceeded.
For example, in Figure 2.2, one would expect the maximum winter temperature for Port Jervis to exceed about
24 degrees centigrade on average once every 100 years. Refer to
Coles (2001) (b) Ch. 3 for more details about these plots.

Figure 2.2: GEV fit diagnostics for Port Jervis winter maximum temperature dataset. Quantile and return level
plots are in degrees centigrade.
In the status section of the main window, several details of the fit are displayed. The maximum likelihood estimates
of each of the parameters are given, along with their respective standard errors (in parentheses). In this case,
$\hat{\mu} \approx$ 15.14 degrees centigrade (0.39745 degrees),
$\hat{\sigma} \approx$ 2.97 degrees (0.27523 degrees) and
$\hat{\xi} \approx$ -0.22 (0.0744). The negative log-likelihood for the model (172.7426) is also displayed.
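As a quick check on the return level plot, the m-year return level can be computed directly from the three estimates using the standard GEV quantile formula (Coles (2001) (b) Ch. 3). The sketch below is illustrative only: the function name gev_return_level is hypothetical, and the inputs are the rounded estimates reported above, read as location, scale and shape.

```python
import math


def gev_return_level(mu: float, sigma: float, xi: float, m: float) -> float:
    """Level exceeded with probability 1/m in any one block (e.g. year)."""
    y = -math.log(1.0 - 1.0 / m)
    if abs(xi) < 1e-8:
        return mu - sigma * math.log(y)  # Gumbel limit as xi -> 0
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Rounded estimates from the status window (location, scale, shape).
z100 = gev_return_level(15.14, 2.97, -0.22, 100)
print(round(z100, 2))
```

With these inputs the result is roughly 23.7 degrees centigrade, in line with the value of about 24 degrees read from the return level plot.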
Note that Figure 2.2 can be re-made in the following manner.
- Plot -> Fit diagnostics.
- Select PORT from the Data Object listbox.
- Select gev.fit1 from the Select a fit listbox, then click OK.
- The diagnostic plots for the GEV fit are displayed.
It may be of interest to incorporate a covariate into one or more of the parameters of the GEV. For example,
the dominant mode of large-scale temperature variability in the mid-latitude Northern Hemisphere is the
North Atlantic Oscillation-Arctic Oscillation (NAO-AO). Such a relationship can be investigated by
including one of these indices as a covariate in the GEV. See
Fitting data to a GEV distribution with a covariate for an example that includes one of
these variables as a covariate.
Return level and shape parameter ($\xi$) $(1-\alpha)$% confidence limits
Confidence intervals may be estimated using the toolkit for either the m-year return level or shape parameter
($\xi$) of either the GEV distribution or the GPD.
The estimates are based on the profile likelihood method: the limits are the points where the profile log-likelihood
intersects the horizontal line at height $\ell_{\max} - \frac{1}{2} c_{1,1-\alpha}$, where $\ell_{\max}$ is the
maximum of the profile log-likelihood and $c_{1,1-\alpha}$ is the $(1-\alpha)$ quantile of a $\chi^2_1$ distribution
(see Coles (2001) (b) section 2.6.5 for more information). The general procedure for
estimating confidence limits for return levels and shape parameters of the GEV distribution using extRemes is as
follows.
- Analyze -> Parameter Confidence Intervals -> GEV fit.
- Select an object from the Data Object listbox.
- Select a fit from the Select a fit listbox.
- Enter search limits for both return level and shape parameter (xi) (and any other options), then click OK.
Example: Port Jervis Data Continued
The MLE of the 100-year return level in the above GEV fit for the Port Jervis data is somewhere
between 20 and 25 degrees (from the return level plot), and
$\hat{\xi} \approx$ -0.2 (standard error about 0.07). These values can be used to find a
reasonable search range for estimating the confidence limits. For the return level, one range that finds
correct confidence limits [5] is from 22 to 28; similarly, for the shape
parameter, one such range is from -0.4 to 0.1. To find confidence limits, do the following.
- Analyze -> Parameter Confidence Intervals -> GEV fit.
- Select PORT from the Data Object listbox.
- Select gev.fit1 from the Select a fit listbox.
- Enter 22 in the Lower limit field of the Return Level Search Range and
28 in the Upper limit field [5].
- Enter -0.4 in the Lower limit field of the Shape Parameter (xi) Search Range
and 0.1 in the Upper limit field, then click OK.
Estimated confidence limits should now appear in the main toolkit dialog. In this case, the estimates are
about 22.42 to 27.18 degrees for the 100-year return level and about -0.35 to -0.05 for $\xi$,
indicating that this parameter is significantly below zero (i.e., Weibull type). Of course, it is also possible to find
limits for other return levels (besides 100-year) by changing the value in the m-year return level field. The
profile likelihoods (Figure 2.3) can also be produced by checking the Plot profile likelihoods checkbutton.
In this case, our estimates are good because, in both plots, the dashed vertical lines intersect the profile
likelihood at the same points as the lower horizontal line.

Figure 2.3: Profile likelihood plots for the 100-year return level (degrees centigrade) and shape parameter
($\xi$) of the GEV distribution fit to the Port Jervis dataset.
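The way the limits are read off a profile likelihood plot can be sketched in a few lines of Python. This is a simplified illustration, not the toolkit's algorithm: the function names are hypothetical, and the profile used here is a toy quadratic built from the shape estimate and its standard error, standing in for the true profile log-likelihood (which is not symmetric, so the toy limits differ from those reported above).

```python
from statistics import NormalDist


def chi2_1_quantile(p: float) -> float:
    """Quantile of a chi-square distribution with 1 degree of freedom.

    Uses the fact that a chi-square(1) variable is a squared standard normal.
    """
    return NormalDist().inv_cdf((1.0 + p) / 2.0) ** 2


def profile_limits(grid, proflik, alpha=0.05):
    """Interpolate where a profile log-likelihood crosses the cutoff line."""
    cutoff = max(proflik) - 0.5 * chi2_1_quantile(1.0 - alpha)
    crossings = []
    for i in range(len(grid) - 1):
        l0, l1 = proflik[i] - cutoff, proflik[i + 1] - cutoff
        if l0 * l1 <= 0 and l0 != l1:
            # Linear interpolation between the two neighbouring grid points.
            crossings.append(grid[i] + l0 * (grid[i + 1] - grid[i]) / (l0 - l1))
    if len(crossings) < 2:
        raise ValueError("cutoff not crossed on both sides; widen the search range")
    return min(crossings), max(crossings)


# Toy quadratic profile (an assumption, NOT the toolkit's profile likelihood).
xi_hat, se = -0.22, 0.0744
grid = [-0.4 + 0.001 * i for i in range(501)]  # search range -0.4 to 0.1
proflik = [-172.7426 - 0.5 * ((x - xi_hat) / se) ** 2 for x in grid]
lower, upper = profile_limits(grid, proflik)
print(round(lower, 3), round(upper, 3))
```

If the search range is too narrow for the cutoff to be crossed on both sides, the sketch raises an error, which mirrors why the search limits entered in the dialog matter.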
Fitting data to a GEV distribution with a covariate
The general procedure for fitting data to a GEV distribution with a covariate is similar to that of fitting data to a GEV
without a covariate, but with two additional steps. The procedure is:
- Analyze -> Generalized Extreme Value (GEV) Distribution. A new window appears.
- Select a data object from the Data Object listbox. Column names appear in the other listboxes.
- Choose a response variable from the Response listbox. The response variable is removed as an option
from the other listboxes.
- Select covariate variable(s) from the Location parameter (mu), Scale parameter (sigma)
and/or Shape parameter (xi) listboxes.
- Select which link function to use for each of these choices, then click OK.
- A GEV distribution will be fitted to the chosen response variable and stored in the same list object as the
data used.
Example 2: Port Jervis data with a covariate
To demonstrate the ability of the Toolkit to use covariates, we shall continue with the Port Jervis data and fit
a GEV on TMX1, but with the Atlantic Oscillation index, AOindex, as a covariate with a linear link
to the location parameter. See Wettstein and Mearns for more information on
this index.
- Analyze -> Generalized Extreme Value (GEV) Distribution.
- Select PORT from the Data Object listbox. Variables are now listed in some of the other listboxes.
- Select TMX1 from the Response listbox. TMX1 is removed from the other listboxes.
- Optionally check the Plot diagnostics checkbox.
- Select AOindex from the Location parameter (mu) list (keep Link as identity), then click OK.
- A GEV fit on the Port Jervis data is performed with AOindex as a covariate in the location parameter.
The status window now displays information similar to the previous example, with one important
exception. Underneath the estimate for MU (now the intercept) is the estimate for the covariate
trend in mu as modeled by AOindex. In this case,
$\hat{\mu} \approx$ 15.25 + 1.15(AOindex)
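To see what this trend means in practice, the fitted location can be evaluated at a few values of the index. The short sketch below uses the rounded coefficients quoted above; the function name and the index values are hypothetical and chosen only for illustration.

```python
def fitted_location(ao_index: float, intercept: float = 15.25, slope: float = 1.15) -> float:
    """Fitted GEV location (degrees centigrade) for a given AOindex value."""
    return intercept + slope * ao_index


for ao in (-2.0, 0.0, 2.0):  # hypothetical index values
    print(ao, round(fitted_location(ao), 2))
```

Each unit increase in AOindex shifts the location of the fitted distribution up by about 1.15 degrees, while the scale and shape parameters stay constant in this model.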
Figure 2.4 shows the diagnostic plots for this fit. Note that only the probability and
quantile plots are displayed and that the quantile plot is in the Gumbel scale. See the
appendix for more details.

Figure 2.4: GEV fit diagnostics for Port Jervis winter maximum temperature dataset with AOindex as a
covariate. Both plots are generated using transformed variables and therefore the units
are not readily interpretable. See appendix for more details.
A test can be performed to determine if this model with AOindex as a covariate is an improvement over the previous fit
without a covariate. Specifically, the test compares the likelihood-ratio, $-2\log(L_0/L_1)$, where $L_0$ and $L_1$
are the likelihoods for each of the two models $M_0$ and $M_1$ ($M_0$ must be nested in $M_1$), to a $\chi^2_\nu$
quantile, where $\nu$ is the difference in the number of estimated parameters. In this case, we have three
parameters estimated for the example without a covariate and four parameters for the case with a covariate because
$\mu = \mu_0 + \mu_1$(AOindex), giving us the new parameters: $\mu_0$, $\mu_1$, $\sigma$ and $\xi$. So,
for this example, $\nu$ = 4 - 3 = 1.
See Coles (2001) (b) section 6.2 for details on this test. Note that the model
without a covariate was stored as gev.fit1 and the model with a covariate was stored as gev.fit2; each time a
GEV is fit using this data object, it will be stored as gev.fitN, where N is the N-th fit performed.
The general procedure is:
- Analyze -> Likelihood-ratio test. A new window appears.
- Select a data object. In this case, select PORT from the Data Object
listbox. Values are filled into the other listboxes.
- Select fits to compare. In this case, select gev.fit1 from the Select base fit (M0) listbox [6]
and gev.fit2 from the Select comparison fit (M1) listbox [6], then click OK.
- The test is performed and the results are displayed in the main toolkit window.
For this example, the likelihood-ratio is about 11.89, which is greater than the 95% quantile of the
$\chi^2_1$ distribution of 3.8415, suggesting that the model with AOindex as a covariate is a
significant improvement over the model without a covariate. The small p-value of 0.000565 further supports this claim.
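The critical value and p-value quoted here can be reproduced with the Python standard library alone, because a chi-square variable with one degree of freedom is the square of a standard normal. The sketch below is a check on the arithmetic rather than the toolkit's own computation; the function name lr_test_1df is hypothetical and it only handles the one-degree-of-freedom case.

```python
import math
from statistics import NormalDist


def lr_test_1df(statistic: float, alpha: float = 0.05):
    """Critical value and p-value for a likelihood-ratio test with 1 df."""
    # chi-square(1) quantile = squared two-sided standard normal quantile.
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    # P(chi-square(1) > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return critical, p_value


critical, p_value = lr_test_1df(11.89)  # likelihood-ratio reported above
print(round(critical, 4), p_value)
```

In terms of the values shown in the status window, the likelihood-ratio equals twice the difference between the negative log-likelihoods of the base fit and the comparison fit.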
In addition to specifying the covariate for a given parameter, the user can indicate what type of link
function should relate that covariate to the parameter. The two available link functions (identity and log) are
indicated by the radiobuttons to the right of the covariate list boxes. This example used the identity link function
(note that the log link is labeled exponential in Stuart Coles' software (ismev)). For example, modeling the
scale parameter ($\sigma$) with the log link and one covariate, say x, gives
$\sigma = \exp(\sigma_0 + \sigma_1 x)$, or $\log \sigma = \sigma_0 + \sigma_1 x$.
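One practical reason to prefer the log link for the scale parameter is that it keeps the parameter positive for every covariate value, which the identity link does not guarantee. The sketch below contrasts the two; the function names, the coefficients s0 and s1, and the covariate values are all hypothetical.

```python
import math


def scale_identity(x: float, s0: float, s1: float) -> float:
    """Identity link: the scale parameter is linear in the covariate."""
    return s0 + s1 * x


def scale_log(x: float, s0: float, s1: float) -> float:
    """Log link: log(scale) is linear in the covariate, so scale > 0."""
    return math.exp(s0 + s1 * x)


s0, s1 = 1.0, -0.8  # hypothetical coefficients
for x in (0.0, 1.0, 2.0):  # hypothetical covariate values
    print(x, round(scale_identity(x, s0, s1), 3), round(scale_log(x, s0, s1), 3))
```

With these coefficients the identity link gives a negative (invalid) scale at x = 2, whereas the log link still returns a positive value.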
[5] If the Lower limit and/or Upper limit fields are left blank, extRemes will make a
reasonable guess for these values. Always check the Plot profile likelihoods checkbutton and inspect the
plots when finding limits automatically, in order to verify that the confidence intervals are correct. If they
do not appear to be correct (i.e., if the dashed vertical lines do not intersect the profile likelihood at
about where the lower horizontal line intersects it), the resulting plot might suggest appropriate
limits to input manually.
[6] If the fit from M0 has more components than that of M1, extRemes will assume
M1 is nested in M0 and compute the likelihood-ratio accordingly.