
Post by Jian Yang on Sept 15, 2015 1:06:08 GMT
It is not recommended to run a GCTA-GREML analysis in a small sample. When the sample size is small, the sampling variance (standard error squared) of the estimate is large (see the GCTA-GREML power calculator), so the estimate of SNP-heritability (h2SNP) will fluctuate a lot and could even hit the boundary of the parameter space (0 or 1). It is therefore not surprising to see an estimate of SNP-heritability of 0 or 1, with a large standard error, in a small sample.

If the estimate hits the boundary (0 or 1), the phenotypic variance-covariance matrix (V) will often become non-invertible and you will see the error message "Error: the variance-covaraince matrix V is not positive definite", or the REML analysis will fail to converge with the error message "Log-likelihood not converged".

Q1: How many samples are required for a GCTA-GREML analysis?
A1: For unrelated individuals and common SNPs, you will need at least 3160 unrelated samples to get the SE down to 0.1 (see Visscher et al. 2014 PLoS Genet). For a GREML analysis with multiple GRMs and/or GRM(s) computed from 1000G imputed data, a much larger sample size is required (see Yang et al. 2015 Nat Genet).

Q2: Why do I need a small standard error (SE)?
A2: The 95% confidence interval (CI) is approximately the h2SNP estimate ± 1.96 * SE. If the SE is too large, the 95% CI will cover the whole parameter space (from 0 to 1), so you won't be able to make any meaningful inference from the estimate.
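To make the sample-size figure in A1 and the confidence interval in A2 concrete, here is a rough Python sketch. It assumes the large-sample approximation var(h2SNP) ≈ 2 / (N^2 * var(A_ij)) for unrelated individuals, with var(A_ij) ≈ 2e-5 for the off-diagonal GRM elements from common SNPs; that value is an assumption chosen to match the 3160 figure and will differ for other SNP sets or for imputed data. The function names are illustrative and are not part of GCTA, and clipping the interval to [0, 1] is a choice made here for display only. For a real study, use the GCTA-GREML power calculator.

```python
import math

# Assumed variance of off-diagonal GRM elements for unrelated individuals
# and common SNPs. This is an approximation, not a GCTA constant.
VAR_GRM_OFFDIAG = 2e-5


def approx_se(n, var_aij=VAR_GRM_OFFDIAG):
    """Approximate SE of the h2SNP estimate for n unrelated samples."""
    return math.sqrt(2.0 / (n ** 2 * var_aij))


def ci95(h2, se):
    """Approximate 95% CI (estimate +/- 1.96 * SE), clipped to [0, 1]."""
    lower = max(0.0, h2 - 1.96 * se)
    upper = min(1.0, h2 + 1.96 * se)
    return lower, upper


if __name__ == "__main__":
    h2 = 0.5  # hypothetical point estimate, for illustration only
    for n in (500, 1000, 3160, 10000):
        se = approx_se(n)
        lower, upper = ci95(h2, se)
        print(f"N={n:>6}  SE~{se:.3f}  95% CI~[{lower:.2f}, {upper:.2f}]")
```

Under these assumptions the SE is roughly 316 / N, so it only reaches 0.1 at about N = 3160, and with N = 500 the interval around an estimate of 0.5 already spans the whole range from 0 to 1.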

