I have 1000 Genomes phase 3 imputed data for roughly 40K individuals (10-12M SNPs), stored in IMPUTE2 dosage format in chunks of 5 Mb x 500 individuals, and I would like to use GCTA to build a GRM. Before I start merging all the imputed chunks and converting them to MACH format, I would like to know whether such a calculation is feasible. Can you give me an estimate of the required memory and CPU hours?
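For reference, my own back-of-envelope estimate is below (a rough sketch only; the assumption that the GRM is held as a double-precision lower triangle, and the batch size of 100K SNPs, are mine and not taken from the GCTA documentation):

    # Rough memory estimate for a GRM over ~40,000 individuals.
    # Assumption (mine, not from the GCTA docs): the GRM is kept in memory
    # as the lower triangle of an n x n matrix in double precision (8 bytes),
    # plus one batch of genotypes/dosages at a time.
    n = 40_000                        # individuals
    pairs = n * (n + 1) // 2          # lower-triangle entries incl. diagonal
    grm_bytes = pairs * 8             # 8 bytes per double
    print(f"GRM lower triangle: {grm_bytes / 1e9:.1f} GB")    # ~6.4 GB

    snps_in_memory = 100_000          # hypothetical SNP batch size
    geno_bytes = n * snps_in_memory   # ~1 byte per genotype call
    print(f"One genotype batch: {geno_bytes / 1e9:.1f} GB")   # ~4.0 GB

If that reasoning is roughly right, a single node with 15 GB might be tight once the full SNP panel and working buffers are included, which is why I am asking about the actual requirements.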
I have the opportunity to use our HPC cluster and can parallelise the calculation across more than 1000 nodes (15 GB of memory per node) at a time via a submit host. Are there ways to parallelise the calculation other than the --thread-num option (e.g. splitting the 40K individuals into smaller chunks), as sketched below?
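To illustrate the kind of splitting I have in mind, here is a sketch of how I imagined generating one job per GRM part (this assumes the --make-grm-part option of recent GCTA versions; the input file name, part count, and output prefix are placeholders, and my data would first need converting from IMPUTE2 dosages):

    # Sketch: write one GCTA command per GRM part, to be submitted as
    # separate jobs on the cluster. Assumes GCTA's --make-grm-part option
    # (recent GCTA versions); "mydata" and the part count are placeholders.
    n_parts = 100
    with open("grm_jobs.txt", "w") as jobs:
        for i in range(1, n_parts + 1):
            jobs.write(
                f"gcta64 --bfile mydata --make-grm-part {n_parts} {i} "
                f"--thread-num 4 --out mydata_grm\n"
            )
    # After all parts finish, the per-part GRM files would presumably be
    # concatenated into a single GRM before downstream analysis.

Is this a sensible way to distribute the work, or is there a better-supported approach?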