I was hoping to use fastGWA on biobank data from around 70k individuals. Creating the GRM with --make-grm would take about 11 days, so to cut down the time I wanted to use --make-grm-part. However, it's not clear to me how this flag affects the GRM calculation. Is the GRM still calculated pairwise across the entire sample, or only pairwise within each part? For example, if I use 70 parts, will the 1000 people in each part be compared to the other 69k individuals, or just to the other 999 individuals in the same part? Thanks!
In your case, the individuals in each part are still compared to the other 69k individuals, not only to the 999 others in the same part.
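To make the counting concrete, here is a rough sketch of how many pairwise entries each part would hold. It assumes the GRM is stored as a lower triangle including the diagonal and that --make-grm-part splits it by row into contiguous blocks of near-equal size. The helper names (part_row_ranges, pairs_in_part) are hypothetical and not part of GCTA, and the exact way GCTA sizes the blocks may differ.

```python
# Sketch only: assumes a lower-triangular GRM (diagonal included) split by row
# into m contiguous, near-equal blocks. Helper names are hypothetical.

def part_row_ranges(n, m):
    """Split row indices 0..n-1 into m contiguous (start, stop) blocks."""
    base, extra = divmod(n, m)
    ranges = []
    start = 0
    for i in range(m):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def pairs_in_part(start, stop):
    """Lower-triangle entries for rows start..stop-1 (row r holds r + 1 entries)."""
    return stop * (stop + 1) // 2 - start * (start + 1) // 2

n, m = 70_000, 70
ranges = part_row_ranges(n, m)
counts = [pairs_in_part(a, b) for a, b in ranges]

print("entries in first part:", counts[0])
print("entries in last part: ", counts[-1])
print("total entries:        ", sum(counts))
```

Under these assumptions the first part has no earlier rows to pair with, so it contains only within-part entries, while later parts grow because each of their rows is paired with every preceding individual. The parts together add up to the full n*(n+1)/2 lower triangle.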
Following up on this question, I tried converting the GRM for each part to a plain-text file (grm.gz). But it seems that only 999*(999-1)/2 rows are output, which suggests that each part contains only the within-part GRM rather than the comparisons with all the other 69k individuals. I am confused about this point. Another question: if the grm.N.bin file only contains a number, why is test.grm.N.bin the same size as test.grm.bin?