sample size and should one thin the SNPs first



Post by qserenali on Mar 29, 2014 0:45:36 GMT

Given the earlier posting that SE ~ 300/N, and a sample size of ~1000, is it even reasonable to interpret the variance explained by SNPs when the SE is ~0.3? In fact, one of the collaborators ran the estimate and came up with h2 of ~0.3 (0.297 and 0.327 for the 2 traits) and SE of ~0.3 (0.292 and 0.289, in the same order). What is the best interpretation: 1) inconclusive; 2) limited evidence of polygenic contribution; or 3) no evidence of polygenic contribution?

I would like to recalculate the numbers taking best practice into consideration:

1) I have 4 non-genetic covariates to correct for. It was suggested that I should correct for these covariates first and feed the residuals into the GCTA-GREML analysis. Is that correct?

2) I wonder whether one should prune the SNPs first to remove the correlation between SNPs. I have both directly genotyped markers (1M) and imputed markers (which include the directly genotyped ones). Depending on the recommended R^2 threshold, there may be no reason to use the imputed set at all. If pruning is recommended, would the following setting be reasonable?

plink --bfile ${DATASET} --indep-pairwise 1500 150 0.2 --out ${DATASET}_pruned

Is there anything else I should be paying attention to? Thanks a lot!
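
For reference, here is a rough check of the numbers quoted above. It is only a sketch: it assumes a normal (Wald-type) approximation, which is crude for an estimate bounded at 0, and the helper names `rule_of_thumb_se` and `wald_summary` are made up for illustration. It is not a substitute for the test reported by the REML software itself.

```python
# Estimates quoted in the post: (h2, SE) for the two traits.
estimates = {
    "trait 1": (0.297, 0.292),
    "trait 2": (0.327, 0.289),
}


def rule_of_thumb_se(n, constant=300.0):
    """Approximate SE of h2 from the SE ~ 300/N rule mentioned in the post."""
    return constant / n


def wald_summary(h2, se, z_crit=1.96):
    """Return (h2/SE, lower, upper) for an approximate 95% interval.

    Assumes the estimate is roughly normally distributed, which is only
    a crude approximation when h2 is close to its lower bound of 0.
    """
    ratio = h2 / se
    return ratio, h2 - z_crit * se, h2 + z_crit * se


if __name__ == "__main__":
    print(f"Rule-of-thumb SE at N = 1000: {rule_of_thumb_se(1000):.3f}")
    for name, (h2, se) in estimates.items():
        ratio, lower, upper = wald_summary(h2, se)
        print(f"{name}: h2/SE = {ratio:.2f}, "
              f"approx. 95% interval = ({lower:.2f}, {upper:.2f})")
```

Under this approximation both ratios come out close to 1 and both intervals include zero, which is the situation the interpretation question is about.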


Post by Zhihong Zhu on Apr 1, 2014 6:02:30 GMT