accuracy() computes model-based BLUP accuracy for each environment (or group) in a multi-environment trial (MET) fitted with ASReml-R V4. Two complementary metrics are available:
Mrode accuracy — per-variety accuracy derived from the prediction error variance (PEV) and the genetic variance.
Generalised H² — a reliability analogue based on the mean squared standard error of difference (SED) between BLUPs.
Both metrics are bounded between 0 and 1, with values closer to 1 indicating better prediction. The function handles all interaction combinations of major ASReml-R V4 random structures for the Site term: fa(), diag(), corgh(), corh(), us(), as well as structures for the variety term id(),ide(),vm() (see below for more info).
plot_accuracy() visualises these results with six complementary plot types covering single-model summaries and head-to-head model comparisons.
Mathematical Framework
Genetic variance and BLUP prediction error
Let \hat{u}_{ij} be the BLUP for variety i in environment j, with prediction error variance
The genetic (co)variance for environment j is denoted G_{jj} — the j-th diagonal of the \mathbf{G} (or the corresponding variance component for simpler structures such as diag()).
Mrode accuracy
Following Mrode (2014), the accuracy for variety i in environment j is
which is the square root of the reliability — the proportion of true genetic variance that is recovered by the BLUP. The mean accuracy for environment j is
\bar{r}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} r_{ij},
where n_j is the number of varieties with observed data in environment j.
accuracy() invokes the argument only = in predict.asreml() to ensure the correct \operatorname{PEV}_{ij} are returned for the target random term supplied.
Generalised H²
Cullis et al. (2006) proposed a heritability analogue based on the mean pairwise squared standard error of difference (SED):
where \overline{\text{SED}}^2_j is the mean of the upper-triangular elements of the SED matrix for environment j. Unlike Mrode accuracy, H^2_j captures the precision of variety comparisons rather than individual predictions, and is equivalent to heritability on a variety-mean basis.
ASReml-R term specifications
The term argument of accuracy() takes the full interaction string from the model’s random = formula (auto-detected when NULL) that is used to specify the Site \times Variety or Site \times Treatment \times Variety interaction.
For ASReml-R multi-site models that used corgh() and corh(), accuracy() will extract the estimated matrix and convert it to a fully unstructured variance matrix \mathbf{G} before accuracy or generalised heritability calculations are undertaken. For single-environment models users can omit the site components and supply the variety term alone, e.g. "id(Variety)" or "vm(Variety, K)".
Example term strings
# Multi-environmentaccuracy(model, term ="fa(Site, 2):id(Variety)")accuracy(model, term ="fa(Site, 1):vm(Variety, K)")accuracy(model, term ="diag(Site):id(Variety)")accuracy(model, term ="corgh(Site):id(Variety)")accuracy(model, term ="us(Site):vm(Variety, K)")# Single-environmentaccuracy(model, term ="id(Variety)")accuracy(model, term ="vm(Variety, K)")
Setup
Simulated data
Requires: ASReml-R V4. The vignette is pre-rendered so no licence is needed to read it; run the code chunks only if you have the package installed.
We use simTrialData() to generate a MET with 50 varieties across 8 environments, 2 replicates, and an "unbalanced" incidence structure. The auto-generated genetic covariance matrix \mathbf{G} ensures environments share genuine genetic covariance (pairwise correlations in the default range [0.20, 0.90]), and the unbalanced incidence ensures that varieties appear in different subsets of environments and this produces meaningful variety-to-variety variation in prediction error variance that makes the per-variety accuracy plots informative.
out <-simTrialData(nvar = 50L,nsite = 8L,nrep = 2L,incidence ="unbalanced",seed = 11L,verbose =FALSE)dat <- out$datadat$Row <-factor(dat$Row)dat$Column <-factor(dat$Column)dat$Site <-factor(dat$Site)dat$Variety <-factor(dat$Variety)# Connectivity summarycat("Varieties per site: mean =", round(mean(colSums(out$params$incidence)), 1)," range [", min(colSums(out$params$incidence)), ",",max(colSums(out$params$incidence)), "]\n")
Varieties per site: mean = 32.2 range [ 28 , 37 ]
cat("Sites per variety: mean =", round(mean(rowSums(out$params$incidence)), 1)," range [", min(rowSums(out$params$incidence)), ",",max(rowSums(out$params$incidence)), "]\n")
We fit three random structures for the Site × Variety genetic term, in increasing order of complexity:
Model
Variety random structure
Genetic parameters
Diag
diag(Site):id(Variety)
6 (one \sigma^2_j per site)
FA(1)
fa(Site, 1):id(Variety)
12 (6 loadings + 6 specific variances)
FA(2)
fa(Site, 2):id(Variety)
18 (12 loadings + 6 specific variances)
All three models include at(Site):Rep for replicate effects and a common homogeneous residual. The diag model assumes no genetic correlation between sites — varieties are re-ranked completely independently across environments. The FA models borrow information across environments: FA(1) captures one dominant axis of Site \times Variety interaction; FA(2) adds a second, allowing richer rank-change patterns of the varieties across sites.
A quick comparison of log-likelihoods confirms the FA models provide a better fit to the data. AIC is computed as -2 \ell + 2p where p is the number of variance parameters:
The G_jj column reports the environment-specific genetic variance used as the denominator. mean_acc is \bar{r}_j and sd_acc captures the spread of per-variety accuracies within the environment. gen.H2 is H^2_j.
Requesting a single metric
Pass a single string to metric when you only need one measure:
# Mrode accuracy onlyaccuracy(m_fa2, term ="fa(Site, 2):id(Variety)", metric ="accuracy")
Set by_variety = TRUE to return one row per variety × environment. All combinations with a finite prediction standard error are returned, including variety × site combinations that were not directly observed in the data (for FA models the G-matrix supports prediction at unobserved combinations). The present column indicates which combinations were observed in the data (TRUE) and which are unobserved (FALSE).
acc_fa2_bv <-accuracy(m_fa2, term ="fa(Site, 2):id(Variety)",metric =c("accuracy", "gen.H2"),by_variety =TRUE)acc_diag_bv <-accuracy(m_diag, term ="diag(Site):id(Variety)",metric =c("accuracy", "gen.H2"),by_variety =TRUE)head(acc_fa2_bv)
The present column flags which variety × site combinations were directly observed in the trial data. For an unbalanced trial, some combinations will be FALSE where varieties are not grown at that site but still receive a BLUP prediction. in most instances they will have a a higher (worse) PEV.
# How many combinations are observed vs unobserved?table(Observed = acc_fa2_bv$present)
Observed
FALSE TRUE
142 258
Visualising with plot_accuracy()
Lollipop — per-site accuracy summary
The lollipop plot is the go-to summary for group-level results. Each site appears as a row, with the lollipop head at the mean accuracy. Horizontal error bars span ±1 standard deviation of the per-variety accuracies (Mrode metric only). When both metrics are requested the plot facets into two panels.
plot_accuracy(acc_fa2, type ="lollipop",metric =c("accuracy", "gen.H2"))
A single-metric call gives a cleaner, unfaceted plot:
plot_accuracy(acc_fa2, type ="lollipop", metric ="accuracy")
Note that Cullis H² is systematically lower than Mrode accuracy — H^2 measures comparison precision while r measures individual prediction reliability, and the two capture subtly different aspects of model quality.
Lollipop — observed vs unobserved accuracy
Passing by_variety = TRUE data to type = "lollipop" automatically triggers a dumbbell layout when both observed and unobserved combinations are present. The blue dot shows the mean accuracy across varieties directly observed at each site; the orange dot shows the mean accuracy for unobserved varieties. The dashed segment connecting the two makes the size of the accuracy gap visible at a glance.
plot_accuracy(acc_fa2_bv, type ="lollipop", metric ="accuracy")
Sites where the two dots sit close together have strong inter-site genetic correlations — the FA G-matrix can borrow enough information from other sites to predict unobserved varieties almost as well as observed ones. Sites with a wider gap have weaker covariance structure and provide less support for unobserved predictions.
Violin — per-variety accuracy distribution
The violin plot requires by_variety = TRUE data and displays the Mrode Accuracy metric only. Cullis H² is a group-level constant and is suppressed automatically — use type = "lollipop" to visualise it. The width of each violin reflects the spread of per-variety accuracies within each environment. Individual points are coloured by present status: blue for varieties directly observed at that site, orange for varieties that were unobserved.
plot_accuracy(acc_fa2_bv, type ="violin", metric ="accuracy")
The orange points cluster at the lower end of each violin, confirming that unobserved varieties have higher PEV and lower accuracy. “Core” varieties observed at most sites appear as blue points at the high-accuracy end; rarely-observed varieties as blue points lower down; and the orange unobserved predictions occupy the tail.
Heatmap — variety × site accuracy grid
The heatmap visualises per-variety accuracy across all environments simultaneously. The full variety × site grid is displayed — observed combinations (present = TRUE) and unobserved combinations (present = FALSE) alike. Varieties are sorted from highest mean accuracy (top) to lowest (bottom); environments are ordered by increasing mean accuracy. The viridis colour scale maps accuracy onto a perceptually uniform gradient. A small grey circle marks each tile where the variety was not observed at that site (present = FALSE); the underlying colour still shows the model-based accuracy for that combination.
plot_accuracy(acc_fa2_bv, type ="heatmap", metric ="accuracy")
Tiles with a grey circle are systematically darker than their circled-free neighbours in the same column, reflecting the higher PEV of unobserved predictions. A column that is uniformly darker than its neighbours indicates an environment where all varieties are predicted less reliably, regardless of observed status.
Accuracy gap — unobserved vs observed
Passing by_variety = TRUE data to type = "diff"without a second model computes the accuracy gap per site: the signed difference between the mean accuracy of unobserved varieties and the mean accuracy of observed varieties (Unobserved − Observed). All sites are expected to show a negative gap — unobserved varieties have higher PEV, so their accuracy is lower.
plot_accuracy(acc_fa2_bv, type ="diff", metric ="accuracy")
Sites with the smallest gap (least negative) have the strongest inter-site genetic correlations. The connecticity in the FA modelling approach can borrow enough information from neighbouring sites to support near-observed-quality predictions. Sites with a larger gap are more informationally isolated: varieties not grown there receive much less support from the rest of the trial.
Comparing models using the res2 argument
The three comparison plot types take a second accuracy object via res2 and directly show the change in accuracy between models. This is where the story of moving from diag to FA models becomes clearly visible.
Dumbbell — paired group-level comparison
The dumbbell plot places two dots per site connected by a horizontal segment. The segment colour encodes the direction of change: green when the second model improves accuracy, red when it declines.
The improvement in FA(2) over diag tends to be larger for environments with stronger GEI — those are the environments where borrowing information via the factor structure is most beneficial.
Scatter — per-variety comparison
The scatter plot places each variety × environment combination as a point, with model1 accuracy on the x-axis and model2 accuracy on the y-axis. Points above the dashed diagonal (x = y) represent variety × environment combinations where model2 is more accurate. Points are coloured by environment.
The scatter plot reveals whether the accuracy gain is uniform across the genetic material or concentrated in particular varieties — for instance, rare or poorly-replicated varieties often benefit most from the additional shrinkage imposed by the FA structure.
Diff — signed accuracy gain between models
The diff plot shows (model2 − model1) for each environment as a lollipop centred on zero. It distils the dumbbell into a single signed quantity, making it easy to rank environments by how much they benefit from the more complex model. The x-axis is always symmetric around zero.
All six plot types support return_data = TRUE, returning a named list with $plot and $data. This is useful for further customisation or downstream analysis.
The ggplot object is fully extensible with standard ggplot2 additions:
rd$plot + ggplot2::ggtitle("Accuracy gain: FA(2) over Diag",subtitle ="Positive values indicate FA(2) is more accurate" )
Summary
accuracy() arguments
Argument
Default
Description
model
—
Fitted ASReml-R V4 model object
term
Auto-detected
Full random-effect interaction term, e.g. "fa(Site, 2):id(Variety)" or "corgh(Site):vm(Variety, giv1)"; auto-detected from the model formula when NULL
metric
c("accuracy","gen.H2")
Metrics to compute; one or both
pworkspace
"2gb"
Passed to predict.asreml()
by_variety
FALSE
Return one row per variety × group; includes a present column (TRUE = observed in data, FALSE = unobserved)
plot_accuracy() types
Type
Input
Best for
"lollipop"
Group-level orby_variety
Accuracy summary per site; dumbbell of observed vs unobserved when present split present
"violin"
by_variety
Distribution of accuracy; points coloured blue (observed) / orange (unobserved)
"heatmap"
by_variety
Variety × site accuracy grid; grey circle marks unobserved tiles
"dumbbell"
Group-level + res2
Paired model comparison per site
"scatter"
by_variety + res2
Per-variety model comparison
"diff"
Group-level + res2orby_variety alone
Signed model accuracy gain; or unobserved vs observed accuracy gap when no res2
Faceting into two panels is automatic when metric = c("accuracy","gen.H2").
References
Cullis, B.R., Smith, A.B. & Coombes, N.E. (2006). On the design of early generation variety trials with correlated data. Journal of Agricultural, Biological and Environmental Statistics, 11(4), 381–393. https://doi.org/10.1198/108571106X154443
Mrode, R.A. (2014). Linear Models for the Prediction of Animal Breeding Values, 3rd edition. CAB International, Wallingford.
Smith, A., Cullis, B. & Thompson, R. (2001). Analysing variety by environment data using multiplicative mixed models and adjustments for spatial field trend. Biometrics, 57(4), 1138–1147. https://doi.org/10.1111/j.0006-341X.2001.01138.x
Gilmour, A.R., Gogel, B.J., Cullis, B.R. & Thompson, R. (2009). ASReml User Guide Release 3.0. VSN International, Hemel Hempstead.