Post by lcweng on Mar 24, 2016 1:01:58 GMT
Hi,
I am new to GCTA and am working on a project estimating heritability for disease outcomes. In my test population, I got a different (higher) h2g estimate after LD pruning, so I am wondering which dataset is better for h2g estimation: the whole imputed dataset or the LD-pruned dataset.
1. Should I remove all SNPs in LD (e.g., r2 > 0.3, 0.6, 0.8, etc.) from my dataset?
2. If causal SNPs have different LD properties but the study sample size is too small for GREML-LDMS, should I use LD-pruned data for h2g estimation?
Thank you.
Post by Jian Yang on Mar 26, 2016 4:45:32 GMT
Re 1) Not recommended.
Re 2) As shown in Yang et al. 2015 NG (the LDMS paper), the bias in h2g due to heterogeneity in LD is small. See also Supplementary Figure 3 of Yang et al. 2015 NG, which shows that the results are robust to the number of components. So, if the sample size is not large enough, I would run a two-component GREML, with one component for common SNPs and one for rare SNPs.
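A two-component GREML of this kind could be set up roughly as follows. This is only a sketch under stated assumptions: the file prefix `mydata`, the phenotype file `mydata.phen`, and the MAF cut-off of 0.01 between rare and common SNPs are placeholders that do not come from this thread, so adjust them to your data. By default the script only prints the GCTA commands; set `GCTA=gcta64` to actually run them.

```shell
#!/bin/sh
# Sketch of a two-component GREML (common vs. rare SNPs) with GCTA.
# Assumptions (not from the thread): PLINK prefix "mydata", phenotype file
# "mydata.phen", and a MAF cut-off of 0.01 separating rare from common SNPs.
# Dry run by default: commands are echoed. Use GCTA=gcta64 to execute them.
GCTA="${GCTA:-echo gcta64}"
BFILE="${BFILE:-mydata}"
PHENO="${PHENO:-mydata.phen}"
MAF_CUT="${MAF_CUT:-0.01}"

# GRM from common SNPs (--maf drops SNPs with MAF below the cut-off)
$GCTA --bfile "$BFILE" --maf "$MAF_CUT" --make-grm --out "${BFILE}_common"

# GRM from rare SNPs (--max-maf keeps SNPs with MAF below the cut-off)
$GCTA --bfile "$BFILE" --max-maf "$MAF_CUT" --make-grm --out "${BFILE}_rare"

# List both GRM prefixes, one per line, as expected by --mgrm
printf '%s\n' "${BFILE}_common" "${BFILE}_rare" > multi_grm.txt

# Fit both variance components jointly
$GCTA --reml --mgrm multi_grm.txt --pheno "$PHENO" --out "${BFILE}_common_rare"
```

For a case-control phenotype you would probably also want to add `--prevalence` with the population prevalence of the disease to the `--reml` step, so that the estimate is reported on the liability scale.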