Post by Jian Yang on Sept 15, 2015 3:07:49 GMT
Yang et al. (2015) Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nature Genetics, 47:1114–1120.
Summary
1) GREML-LDMS can provide an unbiased estimate of heritability using whole genome sequencing data regardless of the MAF and LD properties of the causal variants.
2) ~97% of variation at common sequence variants and ~68% of variation at rare variants can be captured by SNP-array based genotyping followed by 1000G imputation, irrespective of the types of SNP arrays used.
3) The narrow-sense heritability is likely to be 60–70% for height and 30–40% for BMI, the majority of which can be explained by all the 1000G imputed variants. Therefore, the missing heritability for either height or BMI is small (negligible).
4) Height- and BMI-associated loci have been under natural selection.
Addendum
1) In the legend of Figure 4, P_permutation should be P. As described in the Online Methods section, the p-value was calculated from a re-sampling analysis (strictly speaking, not a permutation). Under the null hypothesis that there is no correlation between MAE (minor allele effect) and MAF (minor allele frequency), the observed correlation between mean(MAE) and mean[log10(MAF)] across the MAF bins should not differ significantly from the correlation obtained when mean[log10(MAF)] of the SNPs in each MAF bin is paired with mean(MAE) of the same number of SNPs randomly sampled from the whole genome. The p-value is therefore calculated by comparing the observed correlation with the correlations from 1 million re-samplings (in each re-sampling, the MAE values for each MAF bin were randomly sampled from all the SNPs).
2) In Figure 3a, the y-axis represents the proportion of phenotypic variance explained. However, there is a confusing sentence in the figure legend saying that "Without filtering variants for IMPUTE-INFO score (columns in orange), the sum of the estimates was 96.2% for common variants and 73.4% for rare variants." These are actually the estimates of the proportion of genetic variance explained by the imputed SNPs, i.e. h²_1KGP / h², as defined on the y-axis of figure 4b.
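The re-sampling test described in Addendum 1 can be sketched roughly as follows. This is only an illustration, not the code used in the paper: the function name `resampling_p_value`, the `bin_edges` argument, sampling without replacement, the two-sided comparison, and the +1 correction are all assumptions, and the default number of re-samplings is far smaller than the 1 million used in the analysis.

```python
import numpy as np

def resampling_p_value(maf, mae, bin_edges, n_resamples=1000, seed=0):
    """Re-sampling p-value for the correlation between mean(MAE) and
    mean[log10(MAF)] across MAF bins (hypothetical helper, see lead-in)."""
    rng = np.random.default_rng(seed)
    maf = np.asarray(maf, dtype=float)
    mae = np.asarray(mae, dtype=float)

    # Assign each SNP to a MAF bin using the interior bin edges.
    bin_ids = np.digitize(maf, bin_edges[1:-1])
    n_bins = len(bin_edges) - 1
    bins = [np.flatnonzero(bin_ids == b) for b in range(n_bins)]
    bins = [idx for idx in bins if idx.size > 0]  # skip empty bins

    # Observed per-bin means and their correlation across bins.
    mean_log_maf = np.array([np.log10(maf[idx]).mean() for idx in bins])
    obs_mean_mae = np.array([mae[idx].mean() for idx in bins])
    r_obs = np.corrcoef(mean_log_maf, obs_mean_mae)[0, 1]

    # Null distribution: for each bin, draw the same number of MAE values
    # at random from all SNPs, keeping mean[log10(MAF)] of the bin fixed.
    r_null = np.empty(n_resamples)
    for i in range(n_resamples):
        null_mean_mae = np.array(
            [rng.choice(mae, size=idx.size, replace=False).mean() for idx in bins]
        )
        r_null[i] = np.corrcoef(mean_log_maf, null_mean_mae)[0, 1]

    # Assumed two-sided empirical p-value with a +1 correction.
    p = (1 + np.sum(np.abs(r_null) >= abs(r_obs))) / (n_resamples + 1)
    return r_obs, p
```

With this form of the p-value, the smallest attainable value is 1 / (n_resamples + 1), so the number of re-samplings sets the resolution of the test.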