Factor Analytic Selection Tools: FAST and iClass

fastIC() and plot_fastIC()

Author

Julian Taylor

Published

May 9, 2026

$biomAid$

Overview

fastIC() provides a unified implementation of two complementary approaches for summarising variety performance from a Factor Analytic (FA) linear mixed model fitted in ASReml-R V4:

FAST (Factor Analytic Selection Tools; Smith & Cullis, 2018) — global Overall Performance and stability across all environments.
iClass (interaction class; Smith et al., 2021) — within-class performance and stability for environments grouped by their loading sign patterns.

Both frameworks are treated as a single continuum. FAST global metrics (GlobalOP, GlobalRMSD) are always computed across all environments. iClass metrics (iclass, iClassOP, iClassRMSD) partition environments by the sign pattern of their first ic.num rotated loadings. When ic.num = 1 and — as is typical after rotation — all first-factor loadings are positive, there is exactly one iClass and the two frameworks agree exactly: iClassOP = GlobalOP and iClassRMSD = GlobalRMSD.

plot_fastIC() provides seven complementary visualisations:

Type	Description
`"fast"`	Global OP vs stability (GlobalRMSD) scatter with quadrant annotations
`"biplot"`	FA biplot: genotype scores + environment loading arrows
`"CVE"`	Common Variety Effect heatmap (genotype × environment)
`"VAF"`	Variance Accounted For per environment: stacked 100% bar chart by FA factor
`"iclass"`	iClassOP vs iClassRMSD scatter, faceted by interaction class
`"OP.pairs"`	Lower-triangular pairs plot of iClassOP across all classes
`"OP.variety"`	iClassOP trajectory for highlighted varieties across classes

Mathematical Framework

Factor Analytic mixed model

An FA(k) random structure decomposes the t \times t genetic covariance matrix as \mathbf{G} = \hat{\boldsymbol{\Lambda}}\hat{\boldsymbol{\Lambda}}^\top + \hat{\boldsymbol{\Psi}}, where \hat{\boldsymbol{\Lambda}} \in \mathbb{R}^{t \times k} are the varimax-rotated loadings (element \hat{\lambda}_{rj}: factor r, environment j) and \hat{\boldsymbol{\Psi}} = \operatorname{diag}(\hat{\psi}_1,\ldots,\hat{\psi}_t) is the diagonal specific-variance matrix. After rotation, factor 1 captures the dominant non-crossover GEI with typically all-positive loadings; higher factors are bipolar. ASExtras4::fa.asreml() additionally returns score EBLUPs \hat{\mathbf{F}} \in \mathbb{R}^{n \times k} (element \hat{f}_{rg}: factor r, variety g).

The Common Variety Effect for variety g in environment j is:

\widehat{\text{CVE}}(g,j) = \hat{\boldsymbol{\lambda}}_j^\top \hat{\mathbf{f}}_g = \sum_{r=1}^{k} \hat{\lambda}_{rj}\,\hat{f}_{rg}, \qquad \widehat{\mathbf{CVE}} = \hat{\mathbf{F}}\hat{\boldsymbol{\Lambda}}^\top \in \mathbb{R}^{n \times t}.

FAST global metrics

Let \bar{\lambda}_{1} = t^{-1}\sum_j \hat{\lambda}_{1j} be the mean of the first-factor loadings. The global metrics for variety g are:

\text{GlobalOP}(g) = \bar{\lambda}_{1}\,\hat{f}_{1g}, \qquad \text{GlobalRMSD}(g) = \sqrt{t^{-1}\sum_j \bigl(\widehat{\text{CVE}}(g,j) - \hat{\lambda}_{1j}\hat{f}_{1g}\bigr)^2}.

global_op is on the trait scale (expected performance at an average environment); global_stab is the RMSD of the first-factor residual — small for broadly adapted varieties, large for environment-specific responders.

iClass metrics

Each environment is assigned a class label from the sign pattern of its first q = \texttt{ic.num} loadings:

\text{iClass}(j) = \bigotimes_{r=1}^{q} \begin{cases} \textup{p} & \hat{\lambda}_{rj} \ge 0 \\ \textup{n} & \hat{\lambda}_{rj} < 0 \end{cases}.

For class \omega with environment index set \mathbf{E}_\omega, let \bar{\boldsymbol{\lambda}}_\omega = |\omega|^{-1}\sum_{j\in\mathbf{E}_\omega}(\hat{\lambda}_{1j},\ldots,\hat{\lambda}_{qj})^\top be the class mean loading vector. Then:

\text{iClassOP}(g,\omega) = \sum_{r=1}^{q} \bar{\lambda}_{r\omega}\,\hat{f}_{rg}, \qquad \text{iClassRMSD}(g,\omega) = \sqrt{|\omega|^{-1}\sum_{j\in\mathbf{E}_\omega} \Bigl(\widehat{\text{CVE}}(g,j) - \sum_{r=1}^{q}\hat{\lambda}_{rj}\hat{f}_{rg}\Bigr)^2}.

iClassRMSD uses the residual from the first q factors; the kth factor is reserved for this purpose, so 1 \le \texttt{ic.num} \le k-1.

FAST is the special case q = 1 with one class spanning all environments: \text{iClassOP} = \text{GlobalOP} and \text{iClassRMSD} = \text{GlobalRMSD}. When q \ge 2, environments split into classes and iClassOP captures within-class performance, revealing crossover GEI invisible to GlobalOP.

Variance Accounted For (VAF)

The total genetic variance for environment j is \sigma^2_j = \sum_r \hat{\lambda}_{rj}^2 + \hat{\psi}_j. The per-factor and specific proportions are:

\text{VAF}(r,j) = \frac{\hat{\lambda}_{rj}^2}{\sigma^2_j}, \qquad \text{Specific}(j) = \frac{\hat{\psi}_j}{\sigma^2_j}, \qquad \sum_{r=1}^{k}\text{VAF}(r,j) + \text{Specific}(j) = 1.

The overall VAF for factor r pools across all environments: \overline{\text{VAF}}(r) = \sum_j\hat{\lambda}_{rj}^2 \,/\, \sum_j\sigma^2_j.

Example 1 — Balanced MET: 50 Varieties × 12 Sites

Data simulation

Requires: ASReml-R V4 and ASExtras4. The vignette is pre-rendered so no licence is needed to read it; run the code chunks only if you have the packages installed.

We simulate a balanced MET with pronounced Site × Variety interaction by using a narrow correlation range for the genetic covariance matrix. With g_cor_min = 0.10 and g_cor_max = 0.65, genetic correlations span a wide range, generating both positive-positive and positive-negative loading patterns on higher FA factors — the precondition for meaningful iClass structure.

set.seed(101L)
sim1 <- simTrialData(
  nvar        = 50L,
  nsite       = 12L,
  nrep        = 3L,
  G           = "auto",
  incidence   = "balanced",
  seed        = 101L,
  verbose     = FALSE,
  sim.options = list(
    g_cor_min     = 0.10,
    g_cor_max     = 0.65,
    sigma_genetic = 300,
    error_sd      = 280,
    site_mean     = 4500,
    site_sd       = 450
  )
)
dat1         <- sim1$data
dat1$Site    <- factor(dat1$Site)
dat1$Variety <- factor(dat1$Variety)
dat1$Rep     <- factor(dat1$Rep)
dat1$Row     <- factor(dat1$Row)
dat1$Column  <- factor(dat1$Column)

A quick look at the true genetic correlation structure confirms meaningful variability — some environments are quite poorly correlated with others, providing scope for crossover interaction.

plot_simTrialData(sim1, type = "correlation")

Fitting FA(1), FA(2) and FA(3) models

We fit three models of increasing complexity, using at(Site):Rep for replicate effects within sites:

m1_fa1 <- asreml(
  yield  ~ Site,
  random = ~ fa(Site, 1):Variety + at(Site):Rep,
  data   = dat1,
  maxit  = 30L
)
m1_fa1 <- update(m1_fa1)

m1_fa2 <- asreml(
  yield  ~ Site,
  random = ~ fa(Site, 2):Variety + at(Site):Rep,
  data   = dat1,
  maxit  = 30L
)
m1_fa2 <- update(m1_fa2)

m1_fa3 <- asreml(
  yield  ~ Site,
  random = ~ fa(Site, 3):Variety + at(Site):Rep,
  data   = dat1,
  maxit  = 30L
)
m1_fa3 <- update(m1_fa3)

AIC confirms that the FA(3) model provides the best fit:

.aic <- function(m) -2 * m$loglik + 2 * nrow(summary(m)$varcomp)
data.frame(
  Model  = c("FA(1)", "FA(2)", "FA(3)"),
  LogLik = round(c(m1_fa1$loglik, m1_fa2$loglik, m1_fa3$loglik), 2),
  npar   = c(nrow(summary(m1_fa1)$varcomp),
             nrow(summary(m1_fa2)$varcomp),
             nrow(summary(m1_fa3)$varcomp)),
  AIC    = round(c(.aic(m1_fa1), .aic(m1_fa2), .aic(m1_fa3)), 2)
)

  Model    LogLik npar      AIC
1 FA(1) -11433.93   37 22941.86
2 FA(2) -11404.90   49 22907.80
3 FA(3) -11391.55   61 22905.10

Running `fastIC()`

We run fastIC() on the FA(3) model with ic.num = 2. The first two rotated factors define the iClass labels — environments with loading sign pattern "pp" (positive on both factors) form one class, those with pattern "pn" form another. Factor 3 is reserved for iClassRMSD.

res1 <- fastIC(
  model  = m1_fa3,
  term   = "fa(Site, 3):Variety",
  ic.num = 2L
)

# iClass assignments for each site
unique(res1[, c("Site", "iclass")])[order(unique(res1[, c("Site","iclass")])$iclass), ]

     Site iclass
1   Env01     pn
51  Env02     pn
101 Env04     pn
151 Env05     pn
201 Env06     pn
251 Env11     pn
301 Env12     pn
351 Env03     pp
401 Env07     pp
451 Env08     pp
501 Env09     pp
551 Env10     pp

cat("iClass sizes (number of environments):\n")

iClass sizes (number of environments):

print(table(unique(res1[, c("Site","iclass")])$iclass))


pn pp 
 7  5

The two iClasses partition sites by whether they respond positively or negatively to the second FA factor — a measure of crossover GEI that is invisible from the global rankings alone.

Variance Accounted For

The fastIC()1 function extracts information fromASExtras4::fa.asreml()that can be useful to visualise. The“VAF”` plot below shows, for each environment, what proportion of the total genetic variance (\sigma^2_j = \sum_r \hat{\lambda}_{rj}^2 + \hat{\psi}_j) is accounted for by each FA factor versus left in the specific variance. Each bar reaches 100%; Factor 1 (darkest blue) occupies the bottom and typically dominates, with Factor 2 and Factor 3 adding further layers, and the specific variance (grey) at the top representing unexplained genetic variation. The dashed horizontal line marks the overall mean proportion explained across all environments.

plot_fastIC(res1, type = "VAF")

FAST: Global OP vs Stability

The FAST plot provides the traditional entry point. Varieties in the upper-right quadrant are high-performing and broadly adapted; those in the upper-left combine good average performance with more environment-specific response.

plot_fastIC(res1, type = "fast", n_highlight = 5L)

FA biplot

The biplot places genotypes in the space of factor scores (1 vs 2) and environments as arrows scaled by their loadings. Environments coloured by iClass visually confirm the separation: "pp" sites point into the positive upper-right; "pn" sites point into the positive-negative quadrant. Varieties closest to an environment arrow tend to perform well in that environment.

plot_fastIC(res1, type = "biplot", n_highlight = 5L)

CVE heatmap

The Common Variety Effect heatmap shows the full genotype × environment performance surface. Environments are grouped by iClass (column facets) and ordered by first-factor loading within each class, while varieties are ordered by global_op from top (best) to bottom (poorest). The diverging palette makes crossover patterns immediately visible — varieties that are strongly positive in one class but near-zero or negative in the other.

plot_fastIC(res1, type = "CVE")

iClass: Within-class OP vs RMSD

Faceting by iClass separates the OP vs RMSD trade-off within each interaction class. The vertical dashed line marks the mean iClassOP within the class. Varieties above-right of the intersection are class champions — high within-class performance with low within-class instability.

plot_fastIC(res1, type = "iclass", n_highlight = 4L)

iClassOP pairs plot

With two iClasses the "OP.pairs" plot is a single scatter of iClassOP in class "pn" (x-axis) against class "pp" (y-axis). The dashed diagonal marks equal performance across classes. Varieties well above the line are specialists for "pp" environments; those below favour "pn" environments. Points near the diagonal are generalists.

plot_fastIC(res1, type = "OP.pairs", n_highlight = 5L)

Variety trajectory

The "OP.variety" plot traces each variety’s iClassOP across the ordered interaction classes. Highlighted varieties are selected by default as those with the largest range in iClassOP (i.e. the greatest crossover). Steep crossing lines indicate strong specialists; near-parallel lines indicate stable generalists.

plot_fastIC(res1, type = "OP.variety", n_highlight = 6L)

Example 2 — Unbalanced MET: 75 Varieties × 25 Sites

When the number of environments is larger and the genetic covariance structure is more complex, higher-order FA models and more interaction classes are needed to fully characterise GEI. Here we use an unbalanced incidence structure to reflect a realistic breeding programme where not all varieties are evaluated at every site.

Data simulation

set.seed(202L)
sim2 <- simTrialData(
  nvar        = 75L,
  nsite       = 25L,
  nrep        = 2L,
  G           = "auto",
  incidence   = "unbalanced",
  seed        = 202L,
  verbose     = FALSE,
  sim.options = list(
    g_cor_min     = 0.05,
    g_cor_max     = 0.60,
    sigma_genetic = 320,
    error_sd      = 300,
    site_mean     = 4500,
    site_sd       = 500,
    core_pct      = 0.25,
    core_min_sites_pct = 0.80,
    reg_min_sites_pct  = 0.35,
    reg_max_sites_pct  = 0.75
  )
)
dat2         <- sim2$data
dat2$Site    <- factor(dat2$Site)
dat2$Variety <- factor(dat2$Variety)
dat2$Rep     <- factor(dat2$Rep)
dat2$Row     <- factor(dat2$Row)
dat2$Column  <- factor(dat2$Column)

The unbalanced incidence creates a core set of varieties present across most sites and a peripheral set with sparser coverage:

plot_simTrialData(sim2, type = "incidence")

Connectivity summary:

inc <- sim2$params$incidence
cat("Varieties per site — mean:", round(mean(colSums(inc)), 1),
    " range [", min(colSums(inc)), ",", max(colSums(inc)), "]\n")

Varieties per site — mean: 46.8  range [ 38 , 54 ]

cat("Sites per variety  — mean:", round(mean(rowSums(inc)), 1),
    " range [", min(rowSums(inc)), ",", max(rowSums(inc)), "]\n")

Sites per variety  — mean: 15.6  range [ 9 , 20 ]

The true genetic correlation structure shows considerable variability across the 25 sites, with some site pairs sharing very low or even negative genetic correlations — precisely the conditions under which four FA factors and three iClasses are warranted.

plot_simTrialData(sim2, type = "correlation")

Fitting FA(2), FA(3) and FA(4) models

m2_fa2 <- asreml(
  yield  ~ Site,
  random = ~ fa(Site, 2):Variety + at(Site):Rep,
  data   = dat2,
  maxit  = 30L
)
m2_fa2 <- update(m2_fa2)

m2_fa3 <- asreml(
  yield  ~ Site,
  random = ~ fa(Site, 3):Variety + at(Site):Rep,
  data   = dat2,
  maxit  = 30L
)
m2_fa3 <- update(m2_fa3)

m2_fa4 <- asreml(
  yield  ~ Site,
  random = ~ fa(Site, 4):Variety + at(Site):Rep,
  data   = dat2,
  maxit  = 30L
)
m2_fa4 <- update(m2_fa4)

data.frame(
  Model  = c("FA(2)", "FA(3)", "FA(4)"),
  LogLik = round(c(m2_fa2$loglik, m2_fa3$loglik, m2_fa4$loglik), 2),
  npar   = c(nrow(summary(m2_fa2)$varcomp),
             nrow(summary(m2_fa3)$varcomp),
             nrow(summary(m2_fa4)$varcomp)),
  AIC    = round(c(.aic(m2_fa2), .aic(m2_fa3), .aic(m2_fa4)), 2)
)

  Model    LogLik npar      AIC
1 FA(2) -14981.57  101 30165.13
2 FA(3) -14960.77  126 30173.54
3 FA(4) -14942.11  151 30186.23

Running `fastIC()` with ic.num = 3

With a FA(4) model and ic.num = 3, the first three factors define up to 2^3 = 8 potential iClasses, though only those sign patterns that are actually represented by at least one environment will appear. Factor 4 is reserved for iClassRMSD.

res2 <- fastIC(
  model  = m2_fa4,
  term   = "fa(Site, 4):Variety",
  ic.num = 3L
)

# iClass assignments
ic_tab <- unique(res2[, c("Site", "iclass")])
ic_tab <- ic_tab[order(ic_tab$iclass, ic_tab$Site), ]
cat("iClass sizes (number of environments):\n")

iClass sizes (number of environments):

print(table(ic_tab$iclass))


pnn ppn ppp pnp 
  8   7   5   5

With three classification factors and 25 environments, several distinct interaction classes emerge. Classes reflect groups of environments that share the same pattern of positive/negative response on all three bipolar factors.

Variance Accounted For

With 25 environments and four FA factors, the VAF plot quickly reveals which sites are well captured by the common factor structure and which have substantial specific variance. Sites where the grey segment is large contribute variance beyond what the FA factors explain — these environments may warrant inspection for outlying observations or unusual site characteristics. The subtitle reports the overall proportion explained across all 25 sites.

plot_fastIC(res2, type = "VAF")

FAST: Global OP vs Stability

plot_fastIC(res2, type = "fast", n_highlight = 6L)

FA biplot

With ic.num = 3, each iClass is defined by the sign pattern of the first three loadings — up to 2^3 = 8 potential classes. The default biplot shows Factors 1 & 2 only, so environments whose iClass differs solely in the sign of Factor 3 will overlap in this view. A second biplot using biplot_factors = c(1L, 3L) reveals the Factor 3 separation.

plot_fastIC(res2, type = "biplot", n_highlight = 5L)

plot_fastIC(res2, type = "biplot", n_highlight = 5L,
            biplot_factors = c(1L, 3L))

CVE heatmap

With four iClasses the CVE heatmap is faceted into four columns, each grouping the sites that share the same three-factor loading sign pattern. The within-class rank ordering of varieties often differs strikingly from the global ranking, which motivates the iClass approach.

plot_fastIC(res2, type = "CVE")

iClass OP vs RMSD

Faceted across all iClasses, this plot highlights that the within-class RMSD for most varieties is considerably smaller than global_stab — evidence that the crossover GEI captured by the first three factors has been successfully isolated into the class structure.

plot_fastIC(res2, type = "iclass", n_highlight = 4L)

Lower-triangular iClassOP pairs plot

With four iClasses the "OP.pairs" plot becomes a 3 \times 3 lower-triangular matrix of pairwise iClassOP scatters — one panel for each combination of classes. Each panel uses the same diagonal reference line; points far from the diagonal are class specialists, while points near it perform similarly across both classes. The iClass structure is most informative when the cloud of points in each panel shows clear deviation from the diagonal in a consistent direction.

plot_fastIC(res2, type = "OP.pairs", n_highlight = 5L)

Variety trajectory

The trajectory plot is particularly striking with multiple iClasses. Each line traces a variety’s iClassOP from the first class (left) through to the last (right). Strongly crossing lines — and especially varieties that switch rank — illustrate precisely the crossover GEI that a global OP ranking misses. The highlighted varieties are automatically chosen as those with the greatest range in iClassOP.

plot_fastIC(res2, type = "OP.variety", n_highlight = 8L)

Summary

`fastIC()` arguments

Argument	Default	Description
`model`	—	Fitted ASReml-R V4 model object containing an FA random term
`term`	`"fa(Site, 4):Genotype"`	Full random-effect interaction string, e.g. `"fa(Env, 3):Variety"`. Supports `vm(...)` or `ide(...)` wrappers on the variety factor for relationship-aware models.
`ic.num`	`2L`	Number of factors used to form iClasses (1 \le \texttt{ic.num} \le k - 1); the kth factor is reserved for `iClassRMSD`.
`...`		Additional arguments forwarded to `ASExtras4::fa.asreml()`

`plot_fastIC()` types

Type	Description
`"fast"`	Global OP vs stability scatter with quadrant annotations. Auto-highlights top by `global_op` and top by `global_stab`.
`"biplot"`	FA biplot — genotype score points and environment loading arrows, coloured by iClass. Default axes are Factors 1 & 2; use `biplot_factors = c(i, j)` to display any two factor axes. Requires k ≥ 2.
`"CVE"`	Common Variety Effect heatmap (genotype × environment); environments grouped by iClass, genotypes ordered by `global_op`.
`"VAF"`	Stacked 100% bar chart of Variance Accounted For per environment: FA factors (sequential blues, bottom) plus specific variance (grey, top). Dashed line = overall mean explained.
`"iclass"`	iClassOP vs iClassRMSD scatter, faceted by interaction class; per-class mean reference line. Auto-highlights top by mean iClassOP.
`"OP.pairs"`	Lower-triangular pairs plot of iClassOP across all iClass levels; y = x reference diagonal. Requires ≥ 2 classes. Uses `patchwork` when available.
`"OP.variety"`	iClassOP trajectory across ordered interaction classes; highlighted varieties over grey background. Auto-highlights by range of iClassOP.

All types support highlight = "default" | character | NULL and return_data = TRUE returning list($plot, $data). The biplot_factors argument (default c(1L, 2L)) selects which two FA factor axes to display in the biplot.

References

Smith, A.B. & Cullis, B.R. (2018). Plant breeding selection tools built on factor analytic mixed models for multi-environment trial data. Euphytica, 214, 143.

Smith, A.B., Norman, A., Kuchel, H. & Cullis, B.R. (2021). Plant variety selection using interaction classes derived from factor analytic linear mixed models: models with independent variety effects. Frontiers in Plant Science, 12, 737462.

Overview

Mathematical Framework

Factor Analytic mixed model

FAST global metrics

iClass metrics

Variance Accounted For (VAF)

Example 1 — Balanced MET: 50 Varieties × 12 Sites

Data simulation

Fitting FA(1), FA(2) and FA(3) models

Running fastIC()

Variance Accounted For

FAST: Global OP vs Stability

FA biplot

CVE heatmap

iClass: Within-class OP vs RMSD

iClassOP pairs plot

Variety trajectory

Example 2 — Unbalanced MET: 75 Varieties × 25 Sites

Data simulation

Fitting FA(2), FA(3) and FA(4) models

Running fastIC() with ic.num = 3

Variance Accounted For

FAST: Global OP vs Stability

FA biplot

CVE heatmap

iClass OP vs RMSD

Lower-triangular iClassOP pairs plot

Variety trajectory

Summary

fastIC() arguments

plot_fastIC() types

References

Running `fastIC()`

Running `fastIC()` with ic.num = 3

`fastIC()` arguments

`plot_fastIC()` types