the number of SNPs for make-GRM

Jaden
Guest

the number of SNPs for make-GRM Apr 30, 2014 17:42:41 GMT

Post by Jaden on Apr 30, 2014 17:42:41 GMT

Hi, I have about 3 million imputed genome-wide SNPs and I need to estimate genetic relationships using GCTA. I don't think I should use all of the SNPs; that would be too many. Should I use only the SNPs with a good quality score, such as R2 > 0.95? How many SNPs should I include to get a good estimate of the genetic relationships? Also, if I have about 1 million imputed SNPs, how much memory and how many threads do I need to maximize computing efficiency and shorten the computing time when estimating the genetic relationship matrix? Thank you for your help. Have a good day.

Zhihong Zhu
Moderator

Posts: 88

the number of SNPs for make-GRM May 1, 2014 11:24:45 GMT

Post by Zhihong Zhu on May 1, 2014 11:24:45 GMT

Hi,

Yes, the SNPs need to go through QC before the analysis. Usually, I just include the HapMap3 SNPs with MAF > 0.01, HWE p-value > 1e-6, and imputation R^2 > 0.6. The memory used in the computation also depends on the sample size. If the sample size is ~6k, the GRM can be generated chromosome by chromosome within 1 hour, using 20 to 30 threads and ~20 G of memory for each chromosome (~20*22 G over all 22 chromosomes).
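For anyone who wants to script the thresholds above, here is a minimal Python sketch. The `SnpStats` record, its field names, and the example SNP IDs are all hypothetical; in practice the values would come from your imputation info files and a HWE test, and the actual filtering is usually done by passing a SNP list to the genetics tool itself.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical record layout)."""
    snp_id: str
    maf: float        # minor allele frequency
    hwe_p: float      # p-value of the Hardy-Weinberg equilibrium test
    info_r2: float    # imputation quality R^2
    in_hapmap3: bool  # whether the SNP is on the HapMap3 list


def passes_qc(s, min_maf=0.01, min_hwe_p=1e-6, min_r2=0.6):
    """Apply the thresholds quoted in the post above."""
    return (
        s.in_hapmap3
        and s.maf > min_maf
        and s.hwe_p > min_hwe_p
        and s.info_r2 > min_r2
    )


# Made-up example values, one SNP failing each criterion in turn.
snps = [
    SnpStats("rs_a", 0.25, 0.40, 0.93, True),    # passes everything
    SnpStats("rs_b", 0.005, 0.40, 0.99, True),   # MAF too low
    SnpStats("rs_c", 0.30, 1e-9, 0.99, True),    # fails HWE
    SnpStats("rs_d", 0.30, 0.40, 0.45, True),    # poorly imputed
    SnpStats("rs_e", 0.30, 0.40, 0.95, False),   # not a HapMap3 SNP
]

keep = [s.snp_id for s in snps if passes_qc(s)]
print(keep)
```

The resulting ID list could then be written to a text file, one ID per line, and used as a SNP extraction list.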

Cheers,
Zhihong

Jaden
Guest

the number of SNPs for make-GRM May 1, 2014 16:13:35 GMT

Post by Jaden on May 1, 2014 16:13:35 GMT

Thanks for your answer. I got an error when I ran make-grm on a data set with 8400 samples and about 800000 SNPs.

The error is like this:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Is this because I did not provide enough RAM or threads? I requested 10 threads and 40G. It used about 23G to read in the files and then it crashed.

Thanks again.

chrchang
New Member

Posts: 10

the number of SNPs for make-GRM May 2, 2014 11:16:33 GMT

Post by chrchang on May 2, 2014 11:16:33 GMT

As a temporary measure, you can use PLINK 1.9's --make-grm, which is much more memory-efficient: it should be able to handle 8400 samples x 800k SNPs even when restricted to 2 GB of RAM.

Zhihong Zhu
Moderator

Posts: 88

the number of SNPs for make-GRM May 2, 2014 13:43:25 GMT

Post by Zhihong Zhu on May 2, 2014 13:43:25 GMT

Yes, memory is insufficient. Generating GRM (by GCTA) chromosome by chromosome will be more efficient, and less memory is needed, ~20G for each chromosome. For the whole genome in one run, more than 150G of memory is required.

PLINK2 requires only a little memory, but I'm not sure how long it will take. My guess is that GCTA may run faster than PLINK2, because I think GCTA keeps every matrix in memory, while PLINK2 may spend some time on I/O.
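To see why holding everything in memory gets expensive, a rough back-of-the-envelope calculation helps. The sketch below assumes one 4-byte value per genotype and per GRM element; that byte size is only an assumption for illustration, not a statement about how GCTA or PLINK actually store data, and real programs need extra working space on top.

```python
def dense_matrix_gib(n_rows, n_cols, bytes_per_value=4):
    """Size in GiB of a dense n_rows x n_cols matrix (assumed value size)."""
    return n_rows * n_cols * bytes_per_value / 2**30


def grm_lower_triangle_gib(n_samples, bytes_per_value=4):
    """Size in GiB of the lower triangle (with diagonal) of an n x n GRM."""
    n_pairs = n_samples * (n_samples + 1) // 2
    return n_pairs * bytes_per_value / 2**30


# Numbers from this thread: 8400 samples, about 800000 SNPs.
genotypes_gib = dense_matrix_gib(8400, 800_000)
grm_gib = grm_lower_triangle_gib(8400)

print(f"dense genotype matrix: {genotypes_gib:.1f} GiB")
print(f"GRM lower triangle:    {grm_gib:.2f} GiB")
```

Under these assumptions the genotype matrix alone is in the tens of GiB, while the GRM itself is well under 1 GiB. That is consistent with the advice here: the cost comes from how many SNPs are held in memory at once, so working one chromosome at a time (or streaming the genotypes) cuts the requirement sharply.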

Jaden
Guest

the number of SNPs for make-GRM May 2, 2014 17:06:59 GMT

Post by Jaden on May 2, 2014 17:06:59 GMT

Thanks for the answer. So --make-grm is probably more computationally intensive. But if I use another program such as PLINK2 to estimate the genetic relationships, can the result be read into GCTA and used for the next step? Thank you

chrchang
New Member

Posts: 10

the number of SNPs for make-GRM May 2, 2014 17:44:12 GMT

Post by chrchang on May 2, 2014 17:44:12 GMT

May 2, 2014 17:06:59 GMT Jaden said:

Thanks for the answer. So --make-grm is probably more computationally intensive. But if I use another program such as PLINK2 to estimate the genetic relationships, can the result be read into GCTA and used for the next step? Thank you

Yes, the files generated by PLINK2 --make-grm/--make-grm-gz are compatible with GCTA.
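As a rough illustration of that route, the Python sketch below only assembles a PLINK command line; it does not run anything. The file prefixes `mydata` and `mydata_grm` are placeholders, the executable name `plink` is an assumption about your install, and the 2048 MB passed to `--memory` simply mirrors the 2 GB figure mentioned earlier in the thread. Check the flags against the documentation of the PLINK build you have.

```python
import shlex


def plink_grm_command(bfile, out, memory_mb=2048, plink="plink"):
    """Build (not run) a PLINK command that writes a gzipped GRM."""
    return [
        plink,
        "--bfile", bfile,             # PLINK binary fileset prefix (placeholder)
        "--make-grm-gz",              # gzipped GRM output, as named in the post
        "--memory", str(memory_mb),   # workspace size in MB (assumed limit)
        "--out", out,                 # output prefix (placeholder)
    ]


cmd = plink_grm_command("mydata", "mydata_grm")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell or a job script. On the GCTA side, my understanding is that a gzipped GRM is read with the `--grm-gz` option, but treat that as an assumption and confirm it in the GCTA documentation.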

Jaden
Guest

the number of SNPs for make-GRM May 5, 2014 19:57:03 GMT

Post by Jaden on May 5, 2014 19:57:03 GMT

"Yes, memory is insufficient. Generating GRM (by GCTA) chromosome by chromosome will be more efficient, and less memory is needed, ~20G for each chromosome." Will this create a separate genetic relationship matrix for each chromosome? If so, how do you combine them at the end for later use? Thank you.

Jian Yang
Administrator

Posts: 362

the number of SNPs for make-GRM May 6, 2014 1:15:52 GMT

Post by Jian Yang on May 6, 2014 1:15:52 GMT

You can use --mgrm followed by the --make-grm option to merge the GRMs into a single GRM.
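Putting the per-chromosome advice and the merge step together, a workflow could be sketched as below. The Python code only builds the command lines and the list file contents; it does not run GCTA. The executable name `gcta64`, the prefixes `mydata` and `grm`, and the thread count are assumptions for illustration.

```python
import shlex

AUTOSOMES = range(1, 23)  # chromosomes 1 to 22


def per_chromosome_commands(bfile, out_prefix, threads=10, gcta="gcta64"):
    """One GRM job per autosome; each job loads a single chromosome."""
    return [
        [gcta, "--bfile", bfile, "--chr", str(c), "--make-grm",
         "--thread-num", str(threads), "--out", f"{out_prefix}_chr{c}"]
        for c in AUTOSOMES
    ]


def grm_list_text(out_prefix):
    """Contents of the text file given to --mgrm: one GRM prefix per line."""
    return "\n".join(f"{out_prefix}_chr{c}" for c in AUTOSOMES) + "\n"


def merge_command(list_file, out, gcta="gcta64"):
    """Merge the per-chromosome GRMs listed in list_file into one GRM."""
    return [gcta, "--mgrm", list_file, "--make-grm", "--out", out]


jobs = per_chromosome_commands("mydata", "grm")
merge = merge_command("grm_chrs.txt", "grm_all")

print(shlex.join(jobs[0]))   # first of the 22 per-chromosome jobs
print(shlex.join(merge))     # final merge step
```

Save the output of `grm_list_text("grm")` as `grm_chrs.txt` before running the merge command. The per-chromosome jobs are independent, so they can be run one after another on a small machine or in parallel on a cluster.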
