Post by qserenali on Mar 29, 2014 0:45:36 GMT
Given the earlier posting that SE ≈ 300/N, and that my sample size is ~1,000, is it even reasonable to interpret the variance explained by SNPs when the SE is ~0.3? In fact, one of my collaborators ran the estimation and obtained h2 of ~0.3 (0.297 and 0.327 for the two traits) with SE of ~0.3 (0.292 and 0.289, in the same order). What is the best interpretation: 1) inconclusive; 2) limited evidence of polygenic contribution; or 3) no evidence of polygenic contribution?

I would like to recalculate the numbers taking best practice into consideration.

1) I have 4 non-genetic covariates to correct for. It was suggested that I should correct for these covariates first and feed the residuals into the GCTA-GREML analysis. Is that correct?
2) I wonder whether one should prune the SNPs first to remove the correlation between SNPs. I have both directly genotyped markers (1M) and imputed markers (which include the directly genotyped ones). Depending on the recommended R^2 threshold, there may be no reason to use the imputed set at all. If pruning is recommended, would the following setting be reasonable?
plink --bfile ${DATASET} --indep-pairwise 1500 150 0.2 --out ${DATASET}_pruned
Is there anything else I should pay attention to? Thanks a lot!
|
Post by Zhihong Zhu on Apr 1, 2014 6:02:30 GMT
I think the h^2 is still unknown because the confidence interval is so wide: roughly 0 to 0.9 (estimate ± 2 SE = 0.3 ± 2 × 0.3, with the lower bound truncated at 0).
SE      Sample size
0.3     1,000
0.15    2,000
0.07    4,000
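The arithmetic behind these numbers can be reproduced with a small sketch. It assumes the rule of thumb SE ≈ 300/N quoted earlier in the thread, which is an approximation rather than an exact formula. The function names `approx_se` and `approx_ci` are made up for illustration.

```shell
# Approximate SE of the SNP-heritability estimate from sample size,
# using the rule of thumb SE ~ 300/N (an approximation only).
approx_se() {
    awk -v n="$1" 'BEGIN { printf "%.3f\n", 300 / n }'
}

# Rough interval: estimate +/- 2*SE, clipped to the range [0, 1].
approx_ci() {
    awk -v h2="$1" -v se="$2" 'BEGIN {
        lo = h2 - 2 * se
        hi = h2 + 2 * se
        if (lo < 0) lo = 0
        if (hi > 1) hi = 1
        printf "%.2f %.2f\n", lo, hi
    }'
}

# Print the approximate SE for the sample sizes discussed above.
for n in 1000 2000 4000; do
    echo "N=$n SE~$(approx_se "$n")"
done

# Interval for h2 = 0.3 with SE = 0.3.
approx_ci 0.3 0.3
```

For N = 4,000 the rule gives 0.075, which the table above shows as 0.07. With h2 = 0.3 and SE = 0.3 the clipped interval runs from 0 to 0.9, so it includes zero.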
I didn't prune the SNPs when performing the estimation. I just used QCed common SNPs, i.e. I removed SNPs with HWE p < 1e-6, MAF < 0.01, or imputation R^2 < 0.6 (for imputed SNPs), and excluded samples with relatedness > 0.025.
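A minimal sketch of how these filters might be run with PLINK and GCTA is given below. It assumes PLINK 1.9 and a `gcta64` binary on the PATH. The file names (`mydata`, `mydata.phen`) and the helper functions `run` and `qc_and_greml` are placeholders, not part of either tool. The imputation R^2 filter is left out because it depends on the imputation software used.

```shell
# Placeholders: override DATASET and PHENO for a real analysis.
DATASET="${DATASET:-mydata}"
PHENO="${PHENO:-mydata.phen}"

# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

qc_and_greml() {
    # Remove SNPs with MAF < 0.01 or HWE p < 1e-6; no LD pruning.
    run plink --bfile "$DATASET" --maf 0.01 --hwe 1e-6 --make-bed --out "${DATASET}_qc"
    # Build the genetic relationship matrix from the QCed autosomal SNPs.
    run gcta64 --bfile "${DATASET}_qc" --autosome --make-grm --out "${DATASET}_qc"
    # Drop one of each pair with estimated relatedness > 0.025.
    run gcta64 --grm "${DATASET}_qc" --grm-cutoff 0.025 --make-grm --out "${DATASET}_qc_rm025"
    # GREML estimate of the variance explained by the SNPs.
    run gcta64 --grm "${DATASET}_qc_rm025" --pheno "$PHENO" --reml --out "${DATASET}_greml"
}
```

Setting `DRY_RUN=1` before calling `qc_and_greml` prints the four commands without executing them, which is a cheap way to check the file names first.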