mikem
New Member
Posts: 2
Post by mikem on Nov 19, 2015 12:03:20 GMT
I am trying to test the limits of GCTA's --mlma capacity by steadily increasing the number of samples analysed with a linear mixed model. I am using a pre-computed GRM, and have tested the MLM analysis with 12k, 14k, 25k and 40k individuals. Compute time increases non-linearly with sample size (probably to be expected). The GRM is re-computed for each new sample size.
However, when I attempt to run the MLM analysis with 50k samples, I get a memory error:
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
With smaller sample sizes, I have generally found that this error arises when the individuals in the GRM and the input genotype/phenotype files are not perfectly matched. I am running on a compute node with >300 GB of RAM, and the maximum observed memory usage is ~78 GB. One other process on that node has reserved 60 GB of virtual memory. Each run uses 20 CPUs.
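For anyone wanting to rule out the ID-mismatch cause before looking at memory, here is a minimal sketch of a consistency check. It assumes the usual whitespace-delimited layout with family ID and individual ID in the first two columns (as in a GCTA .grm.id file and a phenotype file). The function names and the tiny in-memory example are hypothetical and only for illustration.

```python
from collections import Counter


def read_ids(lines):
    """Return (FID, IID) pairs from whitespace-delimited lines, skipping short/blank lines."""
    ids = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            ids.append((fields[0], fields[1]))
    return ids


def compare_ids(grm_ids, pheno_ids):
    """Report duplicated GRM IDs and IDs present in only one of the two inputs."""
    counts = Counter(grm_ids)
    duplicated = sorted(key for key, count in counts.items() if count > 1)
    grm_set, pheno_set = set(grm_ids), set(pheno_ids)
    return {
        "duplicated_in_grm": duplicated,
        "only_in_grm": sorted(grm_set - pheno_set),
        "only_in_pheno": sorted(pheno_set - grm_set),
    }


# Hypothetical toy data standing in for a .grm.id file and a phenotype file.
grm_lines = ["F1 I1", "F2 I2", "F2 I2", "F3 I3"]
pheno_lines = ["F1 I1 0.5", "F2 I2 1.2", "F4 I4 -0.3"]

report = compare_ids(read_ids(grm_lines), read_ids(pheno_lines))
for name, values in report.items():
    print(name, values)
```

With real data, the same functions can be fed open file handles in place of the toy lists. If all three lists come back empty, the mismatch explanation becomes less likely.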
I'm running the latest version of GCTA (v1.25.0). Whilst I expect an increase in compute time, I did not expect an outright failure. I previously had failures caused by duplicated IDs (v1.24.7).
Is this a natural sample size limit for GCTA, or is there scope to expand this further? My aim is to be able to analyse ~150k individuals with this MLM. I have looked into using BOLT-LMM; however, in its current incarnation it does not take a pre-computed GRM.
Thanks,
Mike
Post by Jian Yang on Dec 2, 2015 3:10:30 GMT
You might try running this on individual chromosomes separately (e.g. start with chr22) to see if you get the same issue. It's likely to be a RAM issue. We are thinking of releasing a more RAM-efficient version later, but it really depends on whether we have time to do so.
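As a rough illustration of why RAM is a plausible culprit, the sketch below estimates the size of a single dense n x n matrix of 8-byte doubles at the sample sizes mentioned in this thread. How many such matrices GCTA holds in memory at once, and at what precision, is not stated here, so the per-matrix figure is only a building block and not a prediction of total usage.

```python
def dense_matrix_gib(n, bytes_per_entry=8):
    """Size in GiB of one dense n x n matrix (assumes 8-byte doubles by default)."""
    return n * n * bytes_per_entry / 2**30


# Sample sizes taken from the posts above; 150k is the stated target.
for n in (12_000, 14_000, 25_000, 40_000, 50_000, 150_000):
    print(f"{n:>7} individuals: {dense_matrix_gib(n):7.1f} GiB per n x n matrix")
```

Because the size grows with the square of n, going from 50k to 150k individuals multiplies the per-matrix footprint by nine, so a handful of working copies could exceed a 300 GB node.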
Post by mikem on Dec 10, 2015 11:48:12 GMT
Thanks Jian. The GRM was generated from a subset of variants, approx. 10k. I get the same issue when trying to perform a bivariate REML analysis using ~10k SNPs.
I certainly hope you decide to release a more memory-efficient version, for my own selfish reasons!