In a recent publication entitled “Limitations of GCTA as a solution to the missing heritability problem” Krishna Kumar et al. (PNAS 2015) claim that “GCTA applied to current SNP data cannot produce reliable or stable estimates of heritability”.
We show that those claims are false and that results presented by Krishna Kumar et al. are in fact entirely consistent with and can be predicted from the theory underlying GCTA.
Click the link below to download a full commentary on the Krishna Kumar et al. paper (drafted on 13/01/2016; revised on 30/01/2016).
Thank you for asking. Please see below for a summary description of the singular values shown in Figure 1e of our commentary.
Min     Max     Mean    Variance
0.57    2.7     0.97    0.066
Note that these are the square roots of the eigenvalues from a principal component analysis of the GRM, which are equivalent to the singular values of Z / sqrt(m), with m being the number of SNPs (m = 35,221 in our Figure 1e). We have clarified this in our revised commentary (Commentary on Krishna Kumar et al. PNAS 2015.pdf; please also see the link above).
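For readers who want to check this equivalence numerically, here is a minimal sketch in Python with NumPy. It assumes Z is the n x m matrix of standardized genotypes and that the GRM is A = ZZ'/m. The genotypes are simulated and the sizes (n = 50, m = 400) are hypothetical; this is not the data behind Figure 1e.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m_raw = 50, 400  # hypothetical sample size and SNP count

# Simulate 0/1/2 genotype counts with hypothetical allele frequencies.
freq = rng.uniform(0.1, 0.9, size=m_raw)
X = rng.binomial(2, freq, size=(n, m_raw)).astype(float)

# Drop any monomorphic SNPs, then standardize each SNP column.
X = X[:, X.std(axis=0) > 0]
Z = (X - X.mean(axis=0)) / X.std(axis=0)
m = Z.shape[1]

# GRM (assumed form: A = Z Z' / m) and its eigenvalues, largest first.
A = Z @ Z.T / m
eig = np.linalg.eigvalsh(A)[::-1]
sqrt_eig = np.sqrt(np.clip(eig, 0.0, None))  # clip tiny negative round-off

# Singular values of Z / sqrt(m), returned largest first by NumPy.
sv = np.linalg.svd(Z / np.sqrt(m), compute_uv=False)

max_diff = np.max(np.abs(sqrt_eig - sv))
print("max |sqrt(eigenvalue) - singular value| =", max_diff)
```

The printed difference should be at the level of floating-point round-off, which is what the equivalence stated above implies.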
It would be greatly appreciated if you could make the list of SNPs and individuals used in the Krishna Kumar et al. (2015 PNAS) paper publicly available so that readers like us can repeat your results. We did not observe a large number of zero singular values as shown in Figure 3 of Krishna Kumar et al. in the same data. I presume the number of SNPs used in Figure 3 of Krishna Kumar et al. is likely to be smaller than the sample size.
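The reasoning behind that presumption can be illustrated with a minimal sketch in Python with NumPy, assuming simulated genotypes and hypothetical sizes with fewer SNPs than individuals (n = 200, m = 50). The n x n GRM then has rank at most m, so at least n - m of its eigenvalues (and hence singular values) are zero. This only illustrates the general point; it does not reproduce Figure 3 of Krishna Kumar et al.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m_raw = 200, 50  # hypothetical: fewer SNPs than individuals

# Simulate 0/1/2 genotype counts with hypothetical allele frequencies.
freq = rng.uniform(0.1, 0.9, size=m_raw)
X = rng.binomial(2, freq, size=(n, m_raw)).astype(float)

# Drop any monomorphic SNPs, then standardize each SNP column.
X = X[:, X.std(axis=0) > 0]
Z = (X - X.mean(axis=0)) / X.std(axis=0)
m = Z.shape[1]

# n x n GRM (assumed form: A = Z Z' / m); its rank cannot exceed m.
A = Z @ Z.T / m
eig = np.linalg.eigvalsh(A)

# Count eigenvalues that are zero up to numerical tolerance.
n_zero = int(np.sum(eig < 1e-8))
print("zero eigenvalues:", n_zero, "| lower bound n - m:", n - m)
```

When the number of SNPs exceeds the sample size, as in the m = 35,221 case mentioned above, this rank deficiency does not arise apart from the single zero eigenvalue introduced by centering each SNP.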
Nevertheless, I think this is not the main issue. As we said in our commentary, the main problem of the Krishna Kumar et al. paper is a misunderstanding of the theory behind GCTA-GREML. That is, the authors believe that GREML does not take LD between SNPs into account, and therefore mistakenly think that the estimated variance of SNP effects (or its sampling variation) should be the same regardless of the number of SNPs fitted in the model.
Thanks for letting me know that. It seems that you still don't understand that a random-effect model fitting multiple variables accounts for correlations between the variables. Nevertheless, thanks again for your original paper and subsequent response to our commentary. These discussions are valuable for the field to better understand the GCTA-GREML method and, more broadly, random-effect models, since there is obviously confusion about such models.