Estimate the variance explained by common and rare variants

Lars F
New Member

Posts: 1

Post by Lars F on Jul 24, 2014 17:31:07 GMT

Dear Jian,

thank you very much for developing GCTA! Truly amazing work!

I'm interested in answering the question "how much of the variance explained is attributable to common variants and how much to rare variants?".
I have split my case-control genotype data into 170k rare variants and 250k common variants and generated a kinship matrix from each set for > 30k unrelated samples (>15k cases and >15k controls).

I ran an REML analysis and fitted both matrices by using "--mgrm".
I also ran a similar analysis using just the single GRM that I obtained after merging the two matrices using "--mgrm" in combination with "--make-grm".
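
For readers following along, the two analyses described above might be set up roughly as in the sketch below. It only assembles the command lines and does not run GCTA. The executable name "gcta64", the "--pheno" and "--out" arguments, and every file name and function name here are assumptions for illustration, not details taken from this post.

```python
# Sketch only: builds GCTA command lines, does not execute them.
# All file names and prefixes below are hypothetical placeholders.

def mgrm_file_text(prefixes):
    """Text of the file passed to --mgrm: one GRM file prefix per line."""
    return "\n".join(prefixes) + "\n"

def joint_reml_cmd(mgrm_path, pheno_path, out_prefix):
    """Analysis 1: fit the rare and common GRMs jointly in one REML run."""
    return ["gcta64", "--reml", "--mgrm", mgrm_path,
            "--pheno", pheno_path, "--out", out_prefix]

def merge_grm_cmd(mgrm_path, out_prefix):
    """Merge the GRMs listed in the --mgrm file into a single GRM."""
    return ["gcta64", "--mgrm", mgrm_path, "--make-grm", "--out", out_prefix]

def single_reml_cmd(grm_prefix, pheno_path, out_prefix):
    """Analysis 2: REML with the single merged GRM."""
    return ["gcta64", "--reml", "--grm", grm_prefix,
            "--pheno", pheno_path, "--out", out_prefix]

# Hypothetical prefixes for the two GRMs built from the split variant sets.
mgrm_text = mgrm_file_text(["grm_rare", "grm_common"])
commands = [
    joint_reml_cmd("multi_grm.txt", "trait.phen", "reml_joint"),
    merge_grm_cmd("multi_grm.txt", "grm_merged"),
    single_reml_cmd("grm_merged", "trait.phen", "reml_merged"),
]
for cmd in commands:
    print(" ".join(cmd))
```

The first command would report one variance component per GRM plus their sum, while the third would report a single V(G)/Vp for the merged matrix, which are the two quantities compared below.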

The puzzling result was that the "Sum of V(G)/Vp" obtained from the first analysis is much smaller than the "V(G)/Vp" from the second analysis: 0.55 (SE = 0.013) versus 0.69 (SE = 0.013).

I now have three questions for you (if I may bug you with so many questions):
- Do you have an explanation for this observation?
- Is it reasonable to estimate the variance explained using rare variants?
- If so, what would be your recommended setting for this kind of analysis?

I would very much appreciate your expert opinion on this.

Thank you very much for your help!

Best regards from Michigan,
Lars

mrxib3
New Member

Posts: 1

Post by mrxib3 on Sept 30, 2015 13:26:28 GMT

Hi,

I'm also interested in rare variants and would like to ask the same question. I'm working with mostly rare variants (80% of the variants in my data have a MAF < 0.01). Is GCTA appropriate for this kind of data? If not, are 51,000 common SNPs enough for GCTA to work with?

Thanks in advance!

Jian Yang
Administrator

Posts: 362

Post by Jian Yang on Sept 30, 2015 22:56:00 GMT

In our recent publication, we demonstrated the use of the GREML-LDMS method to partition the genetic variance into the contributions from common and rare variants, using whole-genome sequencing data or data imputed to the 1000 Genomes Project reference. Please see the paper below:
www.nature.com/ng/journal/v47/n10/full/ng.3390.html

A tutorial on the GREML-LDMS method can be found at gcta.freeforums.net/thread/194/gcta-ldms-estimating-heritability-data
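
As a rough illustration of the stratification idea behind GREML-LDMS (this is not the tutorial's own script), the sketch below assigns SNPs to bins by MAF and by LD-score quartile. A separate GRM would then be built for each bin, and all of them fitted jointly with "--mgrm". The per-SNP LD scores are assumed to be precomputed elsewhere; the MAF cut-off of 0.01 is simply the rare/common threshold mentioned earlier in this thread; and the function names and example values are hypothetical.

```python
from statistics import quantiles

def ld_quartile(score, cuts):
    """Return 1..4: the LD-score quartile, given three ascending cut points."""
    return 1 + sum(score > c for c in cuts)

def maf_bin(maf, edges):
    """Return the index of the first MAF bin whose upper bound covers maf."""
    for i, upper in enumerate(edges):
        if maf <= upper:
            return i
    raise ValueError(f"MAF {maf} is above the last bin edge")

def stratify(snps, maf_edges=(0.01, 0.5)):
    """Group SNP ids by (MAF bin, LD-score quartile).

    snps: list of dicts with keys "id", "maf" and "ld" (assumed layout).
    maf_edges: upper bounds of the MAF bins (assumed values, 0 = rare).
    """
    cuts = quantiles([s["ld"] for s in snps], n=4)  # three quartile cut points
    bins = {}
    for s in snps:
        key = (maf_bin(s["maf"], maf_edges), ld_quartile(s["ld"], cuts))
        bins.setdefault(key, []).append(s["id"])
    return bins

# Made-up example: eight SNPs, alternating rare and common, LD scores 1..8.
example_snps = [
    {"id": f"s{i}", "maf": 0.005 if i % 2 else 0.2, "ld": float(i)}
    for i in range(1, 9)
]
example_bins = stratify(example_snps)
for key in sorted(example_bins):
    print(key, example_bins[key])
```

Each resulting list of SNP ids could be written to its own file and used to build one GRM per bin; the number of bins, and therefore of variance components, grows with the number of MAF edges chosen.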