|
Post by mckeller on Apr 14, 2016 22:54:59 GMT
Hi Jian,
We're trying to run some interaction analyses in the UK Biobank, but GCTA fails with ~60K samples. We have plenty of RAM for this (1TB). I think there are some vector size limits in C++; maybe that's the issue? What is the maximum sample size GCTA can handle? We could use BOLT-REML, but it cannot estimate interaction effects. For now, we're splitting the sample into smaller subsamples of 40K, but even subsamples that large lead to large SEs for some of the analyses we'd like to perform.
Matt
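The vector-size guess above can be checked with simple arithmetic. The sketch below assumes, purely hypothetically, that a dense n x n matrix is addressed through a signed 32-bit index; nothing in this thread confirms that GCTA does this, and the helper names are made up for illustration.

```python
import math

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def fits_int32_index(n: int) -> bool:
    """Return True if the n*n elements of a dense n x n matrix
    can be counted with a signed 32-bit integer (hypothetical limit)."""
    return n * n <= INT32_MAX


def largest_n_for_int32() -> int:
    """Largest n for which n*n still fits in a signed 32-bit integer."""
    return math.isqrt(INT32_MAX)


if __name__ == "__main__":
    for n in (40_000, 60_000):
        print(n, n * n, fits_int32_index(n))
    print("largest n:", largest_n_for_int32())
```

Under that assumption, a 40K sample stays below the limit and a 60K sample exceeds it. This is circumstantial at best: the follow-up posts attribute the original failure to memory allocation on the users' side.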
|
|
|
Post by Jian Yang on Apr 15, 2016 7:17:13 GMT
We have applied a 3-component GREML analysis to the UKB data. It works fine. What kind of error message did you get?
|
|
|
Post by agwills on Apr 15, 2016 22:48:00 GMT
Hi Jian,
I am working with Matt on this issue, and it seems that this was a memory allocation problem on our side. We have hopefully fixed it (GCTA now appears to be running smoothly), but we will let you know if we get any further errors or problems.
Thank you, Amanda
|
|
|
Post by agwills on May 5, 2016 23:01:16 GMT
Hi Jian,
Univariate analyses have been working fine with sample sizes of 100k+. However, I am getting the following error message when running a bivariate REML analysis:
"line 12: 75730 Segmentation fault"
I was able to run the bivariate analysis successfully on about 50k individuals, but the analysis has repeatedly failed when I upped the sample to about 70k individuals. From the log, the program seems to fail right after reporting the number of cases and controls for each trait.
I used the following options: --grm --reml-bivar 2 3 --keep --pheno --qcovar --covar --reml-bivar-prevalence --thread-num --out
Any help with figuring this out would be much appreciated!
Amanda
|
|
|
Post by ukucam on Aug 29, 2017 8:09:25 GMT
Dear Amanda,
Did you manage to find a solution to this problem?
|
|
|
Post by Jian Yang on Sept 1, 2017 1:32:50 GMT
The new version should work with n > 100K.
|
|
|
Post by ukucam on Sept 10, 2017 9:25:58 GMT
"The new version should work with n > 100K."
This sounds excellent! I have been attempting to run the new version with a sample size of ~40k and ~96k SNPs, and the memory requirement seems to be high. Is there any way to estimate an upper bound on the memory requirement for a bivariate REML run?
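A rough, back-of-the-envelope way to think about this question is sketched below. It is not GCTA's actual allocation scheme. It assumes dense double-precision matrices, a stacked bivariate model of dimension 2n (every individual measured on both traits), and an unknown number of such matrices held in memory at once, exposed as the hypothetical parameter `n_matrices`.

```python
def dense_matrix_gib(dim: int, bytes_per_element: int = 8) -> float:
    """Size in GiB of one dense dim x dim matrix (8 bytes = double precision)."""
    return dim * dim * bytes_per_element / 1024**3


def bivariate_reml_gib(n: int, n_matrices: int = 1,
                       bytes_per_element: int = 8) -> float:
    """Rough memory estimate for a bivariate analysis of n individuals.

    Assumption: the two traits are stacked, so each matrix is 2n x 2n.
    n_matrices is a guess at how many such matrices exist simultaneously;
    the real number depends on the implementation and is not known here.
    """
    return n_matrices * dense_matrix_gib(2 * n, bytes_per_element)


if __name__ == "__main__":
    n = 40_000
    for k in (1, 3, 6):
        print(f"n={n}, {k} matrices of size 2n x 2n: "
              f"{bivariate_reml_gib(n, k):.1f} GiB")
```

Under these assumptions a single 2n x 2n matrix for n = 40k already takes roughly 48 GiB, and the total scales linearly with the number of matrices kept in memory. The SNP count does not enter this estimate, since REML run from a precomputed GRM (`--grm`) works on matrices whose size depends on the number of individuals.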
|
|