Multivariate Fixed-Effects Regression of Treatment BLUEs Within Groups

fixedRegress() is the fixed-effects (BLUE) analogue of randomRegress(). For each group defined by the by argument, it regresses the BLUEs of each conditioned treatment on the BLUEs of its conditioning set using ordinary least squares, and returns a response index (the regression residual) for every genotype.

For treatment $j$ with conditioning set $A_j$ of size $a = |A_j|$, the OLS fit is:

$$ \hat{\bm{\tau}}_j = \bm{X}_j \hat{\bm{\beta}}_j + \tilde{\bm{\tau}}_j, \qquad \bm{X}_j = \bigl[\bm{1},\; \hat{\bm{\tau}}_{A_j}\bigr] $$

where $\hat{\bm{\tau}}_j$ and $\hat{\bm{\tau}}_{A_j}$ hold the predicted values (BLUEs) across the $n$ genotypes for treatment $j$ (an $n$-vector) and for its conditioning set (an $n \times a$ matrix). The response index $\tilde{\bm{\tau}}_j = \bar{\bm{H}}_j \hat{\bm{\tau}}_j$ is the vector of OLS residuals, where $\bar{\bm{H}}_j = \bm{I} - \bm{X}_j(\bm{X}_j^\top\bm{X}_j)^{-1}\bm{X}_j^\top$ is the residual-maker (annihilator) matrix, that is, the identity minus the hat matrix. Its variance–covariance matrix is $\sigma_j^2\bar{\bm{H}}_j$, with $df = n - a - 1$ residual degrees of freedom.
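
The residual formula above can be checked numerically. The following is a minimal base R sketch on simulated data, not the package's internal code: the vectors tau_A and tau_j are made-up stand-ins for BLUEs, and a single conditioning treatment ($a = 1$) is assumed.

```r
set.seed(1)
n     <- 12                                   # hypothetical number of genotypes
tau_A <- rnorm(n)                             # simulated BLUEs, conditioning treatment
tau_j <- 0.5 + 0.8 * tau_A + rnorm(n, sd = 0.3)  # simulated BLUEs, treatment j

X     <- cbind(1, tau_A)                      # design matrix [1, tau_A]
H_bar <- diag(n) - X %*% solve(crossprod(X)) %*% t(X)  # I - X (X'X)^{-1} X'
resp  <- drop(H_bar %*% tau_j)                # response index = OLS residuals

df_resid <- n - ncol(X)                       # n - a - 1 with a = 1
sigma_j  <- sqrt(sum(resp^2) / df_resid)      # residual standard deviation

# The same quantities from lm(), for comparison
fit <- lm(tau_j ~ tau_A)
all.equal(resp, unname(residuals(fit)))
all.equal(sigma_j, summary(fit)$sigma)
```

Forming $\bar{\bm{H}}_j$ explicitly is only for illustration; in practice lm() or a QR decomposition gives the same residuals without building an $n \times n$ matrix.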

The same four conditioning schemes as randomRegress() are available via the type argument:

"baseline" (default): Each treatment other than levs[1] is regressed on levs[1] alone (simple regression, $df = n-2$).
"sequential": Treatment $j$ is regressed on all preceding treatments levs[1:(j-1)] (multiple regression). By the Gram–Schmidt property, the residuals for treatment $j$ are uncorrelated with the BLUEs of all preceding treatments.
"partial": Each treatment is regressed on all other treatments simultaneously. The residuals are partial regression residuals: uncorrelated with their conditioning set, but not necessarily with each other.
"custom": Conditioning set for each treatment supplied explicitly via cond.
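
To make the schemes concrete, the sketch below builds the conditioning sets that each built-in type implies for a given levs. The helper resolve_cond() is hypothetical and written only for illustration; it is not part of the package, and the package's own cond_list may be structured differently (for instance, it may also list unconditioned treatments).

```r
# Hypothetical helper (not a package function): conditioning sets per scheme.
resolve_cond <- function(levs, type = c("baseline", "sequential", "partial")) {
  type <- match.arg(type)
  stopifnot(is.character(levs), length(levs) >= 2)
  cond <- switch(type,
    # every treatment after the first is conditioned on levs[1] only
    baseline   = lapply(levs[-1], function(l) levs[1]),
    # treatment j is conditioned on levs[1:(j-1)]
    sequential = lapply(seq_along(levs)[-1], function(j) levs[seq_len(j - 1)]),
    # every treatment is conditioned on all the others
    partial    = lapply(levs, function(l) setdiff(levs, l))
  )
  names(cond) <- if (type == "partial") levs else levs[-1]
  cond
}

levs <- c("T0", "T1", "T2")
str(resolve_cond(levs, "baseline"))
str(resolve_cond(levs, "sequential"))
str(resolve_cond(levs, "partial"))
```

Written this way, the "sequential" result has the same shape as a cond list passed with type = "custom", apart from the unconditional first treatment being left out.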

Usage

fixedRegress(
  model,
  term = "Treatment:Genotype",
  by = NULL,
  levs = NULL,
  type = "baseline",
  cond = NULL,
  min_obs = NULL,
  ...
)

Arguments

model: An ASReml-R V4 model object.
term: Character string specifying the classify term containing the treatment and genotype factors, e.g. "Treatment:Genotype" or "Treatment:Site:Genotype".
by: Optional character string naming a factor in term to split the analysis by (e.g. "Site"). Separate regressions are performed within each level of by. When NULL (default) the regression is performed over all observations jointly.
levs: Character vector of length $\ge 2$ giving the treatment labels. The first element is the baseline for type = "baseline" and type = "sequential". Cannot be NULL.
type: Conditioning scheme. One of "baseline" (default), "sequential", "partial", or "custom". See Description.
cond: Named list required when type = "custom". Each name must be a treatment label from levs; each value is NULL (unconditional) or a character vector of conditioning treatment labels. Treatments absent from cond are treated as unconditional.
min_obs: Minimum number of genotypes with non-missing BLUEs in all required treatments for a regression to be attempted. Defaults to $\max(5,\; 2(a_{\max} + 1))$, where $a_{\max} = \max_j |A_j|$ is the size of the largest conditioning set. Groups below this threshold produce a warning and are omitted.
...: Additional arguments forwarded to asreml::predict.asreml().
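
As a worked illustration of the min_obs default, the sketch below evaluates the stated formula for a few conditioning structures. It assumes that $|A_j|_{\max}$ means the size of the largest conditioning set; default_min_obs() is a hypothetical helper, not a package function.

```r
# Hypothetical helper (not a package function): default for min_obs,
# assuming |A_j|_max is the size of the largest conditioning set.
default_min_obs <- function(cond_list) {
  a_max <- max(lengths(cond_list), 0L)   # NULL (unconditional) entries count as 0
  max(5, 2 * (a_max + 1))
}

default_min_obs(list(T1 = "T0", T2 = "T0"))                           # a_max = 1
default_min_obs(list(T0 = NULL, T1 = "T0", T2 = c("T0", "T1")))       # a_max = 2
default_min_obs(list(T3 = c("T0", "T1", "T2")))                       # a_max = 3
```

Under this reading the floor of 5 applies for conditioning sets of size one, and the threshold grows by two genotypes for each additional conditioning treatment.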

Value

A named list with the following elements:

blues: Data frame with columns: the grouping variable (or "Group" when by = NULL), Genotype, one raw BLUE column per element of levs, one resp.<lev> column per conditioned treatment (response indices / OLS residuals), one se.<lev> column (residual standard errors), and one HSD.<lev> column (Tukey HSD for pairwise genotype comparisons on the response-index scale).
beta: Named list of length $n_{\text{cond}}$. Each element beta[["<lev>"]] is a data frame with the grouping variable plus one column per conditioning treatment, giving the OLS regression coefficients estimated within each group.
sigmat: Numeric matrix of dimensions $n_{\text{groups}} \times n_{\text{cond}}$ containing the residual standard deviation $\sigma_j$ from each OLS regression.
cond_list: The resolved conditioning structure (named list).
type: The type argument used.

Examples

if (FALSE) { # \dontrun{
## Baseline: T1 and T2 each regressed on T0 within each Site
res_base <- fixedRegress(model,
                         term = "Treatment:Site:Genotype",
                         by   = "Site",
                         levs = c("T0", "T1", "T2"))

## Sequential: T1|T0, T2|{T0,T1} -- Gram-Schmidt orthogonal residuals
res_seq  <- fixedRegress(model,
                         term = "Treatment:Site:Genotype",
                         by   = "Site",
                         levs = c("T0", "T1", "T2"),
                         type = "sequential")

## Partial: each treatment vs all others within each site
res_part <- fixedRegress(model,
                         term = "Treatment:Site:Genotype",
                         by   = "Site",
                         levs = c("T0", "T1", "T2"),
                         type = "partial")

## Custom: T0 unconditional, T1|T0, T2|{T0,T1}
res_cust <- fixedRegress(model,
                         term = "Treatment:Site:Genotype",
                         by   = "Site",
                         levs = c("T0", "T1", "T2"),
                         type = "custom",
                         cond = list(T0 = NULL,
                                     T1 = "T0",
                                     T2 = c("T0", "T1")))
} # }
