How can I estimate the fixed effects from GCTA-GREML?


Post by Jian Yang on Sept 21, 2015 11:31:30 GMT

For an analysis without a covariate, the GREML model can be written as
y = mu + g + e
where mu is the mean term (fixed effect), g is the genetic value (random effect) and e is the residual.

1. Categorical covariate (e.g. sex and cohort): --covar option
If the covariate is categorical with t categories (e.g. t = 2 for sex), it is fitted as t - 1 indicator variables. Fitting all t indicators together with the mean would make X^T V^-1 X non-invertible (X is the design matrix for the fixed effects and V is the variance-covariance matrix of y). Therefore, the model can be written as
y = mu + x_c(2)*b_c(2) + x_c(3)*b_c(3) + ... + x_c(t)*b_c(t) + g + e
where x_c(i) is coded as 1 or 0 (representing the presence or absence of category i) and b_c(i) is interpreted as the difference in mean phenotype between category i and category 1 (the reference category). Note that the order of the categories is determined by their order of appearance in the covariate file.
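
The coding described above can be illustrated with a short Python sketch. This is not GCTA code; the function name dummy_code and the toy cohort labels are made up for illustration. It builds the t - 1 indicator columns with categories ordered by first appearance, and checks the rank argument: adding a column for every category on top of the mean gives a design matrix that is not of full column rank.

```python
import numpy as np

def dummy_code(labels, drop_first=True):
    """Return (levels, X): levels in order of first appearance, X the 0/1 indicator columns.

    With drop_first=True the first level seen is the reference category and gets no column.
    """
    levels = list(dict.fromkeys(labels))          # unique labels, first-appearance order
    kept = levels[1:] if drop_first else levels
    X = np.array([[float(lab == lev) for lev in kept] for lab in labels])
    return levels, X

# Hypothetical covariate column for five individuals (t = 3 categories).
cohort = ["B", "A", "B", "C", "A"]

levels, X_c = dummy_code(cohort)                  # "B" is category 1 (the reference)
ones = np.ones((len(cohort), 1))                  # column for the mean term mu

X_ok = np.hstack([ones, X_c])                     # mu + (t - 1) indicators: 3 columns
_, X_all = dummy_code(cohort, drop_first=False)
X_over = np.hstack([ones, X_all])                 # mu + t indicators: 4 columns

print(levels)                                     # ['B', 'A', 'C']
print(np.linalg.matrix_rank(X_ok), X_ok.shape[1])     # 3 3 -> full column rank
print(np.linalg.matrix_rank(X_over), X_over.shape[1]) # 3 4 -> rank deficient
```

In X_over the indicator columns sum to the column of ones, which is why one category has to be dropped.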

2. Quantitative covariate (e.g. age): --qcovar option
The covariate is fitted as a continuous variable, so the model is
y = mu + x_q(1)*b_q(1) + g + e
where the interpretation of b_q(1) is similar to that of a slope in a linear regression, i.e. the expected change in y per unit change in x_q(1).

3. If we have a categorical covariate and two quantitative covariates, the model is
y = mu + x_c(2)*b_c(2) + x_c(3)*b_c(3) + ... + x_c(t)*b_c(t) + x_q(1)*b_q(1) + x_q(2)*b_q(2) + g + e

Of course, we could also fit multiple quantitative covariates and multiple categorical covariates.
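
To make the role of X and V concrete, here is a minimal Python sketch of the generalized least squares estimate of the fixed effects, b = (X^T V^-1 X)^-1 X^T V^-1 y, for a model with a mean, one quantitative covariate and one three-level categorical covariate. It is an illustration under stated assumptions, not GCTA's implementation: V is treated as known (in a REML analysis it is built from the estimated variance components), and the function name, the toy data, the stand-in relationship matrix A and the column order of X are all made up for the example.

```python
import numpy as np

def gls_fixed_effects(X, V, y):
    """Return b = (X' V^-1 X)^-1 X' V^-1 y and its covariance (X' V^-1 X)^-1."""
    Vinv_X = np.linalg.solve(V, X)                # V^-1 X without forming V^-1 explicitly
    Vinv_y = np.linalg.solve(V, y)                # V^-1 y
    XtVinvX = X.T @ Vinv_X
    b = np.linalg.solve(XtVinvX, X.T @ Vinv_y)
    cov_b = np.linalg.inv(XtVinvX)
    return b, cov_b

# Toy data for six individuals (all values hypothetical).
age  = np.array([30.0, 41.0, 52.0, 38.0, 45.0, 60.0])   # quantitative covariate
is_A = np.array([0.0, 1.0, 0.0, 0.0, 1.0, 0.0])         # indicator for category 2
is_C = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 1.0])         # indicator for category 3
X = np.column_stack([np.ones(6), age, is_A, is_C])      # column order chosen for this sketch

# Assumed-known V = A * sigma2_g + I * sigma2_e, with a simple positive-definite stand-in for A.
A = 0.9 * np.eye(6) + 0.1 * np.ones((6, 6))
V = 0.4 * A + 0.6 * np.eye(6)

# Noise-free phenotype, so GLS should recover the coefficients used to build it.
b_true = np.array([1.0, 0.02, 0.5, -0.3])
y = X @ b_true

b_hat, cov_b = gls_fixed_effects(X, V, y)
print(np.round(b_hat, 6))
```

The square roots of the diagonal of cov_b give the standard errors of the estimates under the assumed V.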

These fixed effects can be estimated using the --reml-est-fix option in a REML analysis. The estimates are shown in the log output in the following order: the effect of each quantitative covariate, followed by the effect of each category of the categorical covariates. Note that this differs from the order in which the terms are written in the model in point 3, where the categorical terms come first.