Post by yangis on Dec 19, 2017 11:49:07 GMT
Hi,
I have a question about the implementation of the GCTA-GREML method with multiple GRMs.
A basic analysis of my data (about 10,000 individuals & 500,000 SNPs after QC) using one GRM (all SNPs/individuals) gives me a heritability of about 0.15 (i.e. Sum of V(G)/Vp = 0.15).
When I split the genetic variance into 22 parts (1 GRM per autosome) and run the model using the --mgrm option, I obtain a similar result: the sum of V(G)/Vp over the 22 chromosomes is about 0.14 (the difference from the first model is likely within the SE). No problem, everything is working perfectly, and the chromosomes that I expected to have a larger variance actually did. Perfect! (I'm so grateful for this software!)
Then I further tried to split the genetic variance into different parts, using 2 GRMs: One made from a 'window' of 100 adjacent SNPs, the other from the rest of the SNPs (~500,000). I repeated the process over the entire genome, one window at a time.
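For readers who want to reproduce the setup, the window/rest split described above can be sketched roughly as follows. This is only an illustration, not necessarily the script that was actually used: the function name `window_partitions` and the SNP IDs are hypothetical, and the sketch assumes the SNP list is already in genomic order and ignores chromosome boundaries.

```python
from typing import Iterator, List, Tuple


def window_partitions(
    snp_ids: List[str], window_size: int = 100
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (window, rest) pairs over consecutive blocks of SNPs.

    Assumes snp_ids is sorted in genomic order. Chromosome boundaries
    are NOT handled here, so a window may straddle two chromosomes.
    """
    for start in range(0, len(snp_ids), window_size):
        window = snp_ids[start:start + window_size]
        # Everything outside the current window goes into the second GRM.
        rest = snp_ids[:start] + snp_ids[start + window_size:]
        yield window, rest


# Hypothetical example with made-up SNP names.
snps = [f"rs{i}" for i in range(250)]
parts = list(window_partitions(snps, window_size=100))
```

Each of the two lists could then be written to a text file and used to build one GRM per list (for instance with GCTA's --extract together with --make-grm), with both GRMs listed in the file passed to --mgrm.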
For most 'windows', I obtained something like V(G1)/Vp=0.0002 (for the 100 SNP 'window') and V(G2)/Vp=0.1498 (for the rest). I am simplifying the numbers for the sake of clarity, but this is expected given the small window size: each window is only expected to contribute very modestly to the overall heritability (at least for non-significant windows), and the whole thing is expected to approximately add up to 0.15, i.e. the heritability from the first model I described. So far, so good.
However, for windows with a high LRT relative to the model without the window (significant windows?) and a p-value < 1*10e-05, I got very different results: V(G1)/Vp=0.6~0.8 (60%+ for the 100 SNP 'window'!!) and V(G2)/Vp=0.002 (for the rest).
I would expect windows that contain causal variants to have a higher V(G1)/Vp than most windows, but certainly not that high, and the sum of V(G)/Vp should still be around 0.15.
I have trouble making sense of this result. It happened for many windows (not just one outlier): sometimes for windows where the LRT gave a very low p-value, and sometimes for windows where the LRT was 0 and the p-value 0.5. In all cases, though, LogL was very different from LogL0 (LogL0 being the log-likelihood of the model without the window variance).
I also have trouble understanding how adding the window variance to the model somehow reduces the log-likelihood... leading to 1) a LogL quite different from LogL0, and 2) LRT = 0.
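To make the LRT = 0, p = 0.5 pattern concrete, here is a minimal sketch of how such a test is commonly computed. It rests on two assumptions that have not been checked against the GCTA source: that the statistic is truncated to zero when LogL < LogL0, and that the p-value for a single variance component on the boundary comes from a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom. The function name and the log-likelihood values are hypothetical.

```python
import math


def lrt_and_pvalue(logl: float, logl0: float) -> tuple:
    """Return (LRT, p-value) for one variance component tested at the boundary.

    Assumption: a negative statistic is truncated to 0, and the null
    distribution is 0.5 * (point mass at 0) + 0.5 * chi-square(1 df).
    """
    lrt = max(0.0, 2.0 * (logl - logl0))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = 0.5 * math.erfc(math.sqrt(lrt / 2.0))
    return lrt, p_value


# Hypothetical values: the full model ends with a LOWER LogL than the reduced one.
print(lrt_and_pvalue(-5010.0, -5000.0))  # -> (0.0, 0.5)
```

Under these assumptions, any fit whose LogL ends up below LogL0 is reported as LRT = 0 with p = 0.5, however large the gap between the two log-likelihoods actually is. Since the model with the window nests the model without it, its maximized log-likelihood should not be lower, so a lower LogL would point to the optimization not having reached the maximum rather than to a property of the data.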
Changing window size does not seem to improve the results. Neither does removing related individuals from the original data using --grm-cutoff.
Do you have any idea why this happens/how to solve the problem?
In any case, thank you so much for your great work with the software!