Post by Caity Anderson on Feb 12, 2014 16:42:29 GMT
I’ve read several papers that suggest calculating the GRM from the entire genome, excluding the region of interest. Until now, I had been calculating the GRM from the SNPs in a defined region of a particular chromosome and using that same subset in my REML analysis. When I instead used a GRM calculated from the entire genome and ran the REML analysis on the same subset as before, the calculation finished almost instantly (the log reports a computational time of 0:0:0) and produced output suggesting the test was not working as I had intended (see below): V(e) was constrained in every iteration and V(G)/Vp came out at 0.999998. How do you suggest running REML on just a subset of SNPs while using a GRM built from all of the genotype data?
Thanks!
-Caity Anderson
[caitlin@login freeze2_vcf]$ ~/src/gcta_1.24/gcta64 --grm-bin freeze2chrandsnpidsnox.GRM --pheno ../phenotypedata_sort.phen --extract 12190k_12260ksubset.list --reml --out freeze212190k_12260ksubsetfromfullGRM.REML --thread-num 1
*******************************************************************
* Genome-wide Complex Trait Analysis (GCTA)
* version 1.24
* (C) 2010-2013 Jian Yang, Hong Lee, Michael Goddard and Peter Visscher
* The University of Queensland
* MIT License
*******************************************************************
Analysis started: Fri Feb 7 16:36:08 2014
Options:
--grm-bin freeze2chrandsnpidsnox.GRM
--pheno ../phenotypedata_sort.phen
--extract 12190k_12260ksubset.list
--reml
--out freeze212190k_12260ksubsetfromfullGRM.REML
--thread-num 1
Note: This is a multi-thread program. You could specify the number of threads by the --thread-num option to speed up the computation if there are multiple processors in your machine.
Reading IDs of the GRM from [freeze2chrandsnpidsnox.GRM.grm.id].
205 IDs read
Reading the GRM from [freeze2chrandsnpidsnox.GRM.grm.bin].
Reading the number of SNPs for the GRM from [freeze2chrandsnpidsnox.GRM.grm.N.bin].
Pairwise genetic relationships between 205 individuals are included from [freeze2chrandsnpidsnox.GRM.grm.bin].
Reading phenotypes from [../phenotypedata_sort.phen].
Non-missing phenotypes of 173 individuals are included from [../phenotypedata_sort.phen].
169 individuals are in common in these files.
Performing REML analysis ... (Note: may take hours depending on sample size).
169 observations, 1 fixed effect(s), and 2 variance component(s)(including residual variance).
Calculating prior values of variance components by EM-REML ...
Updated prior values: 48.7878 51.1137
logL: -484.903
Running AI-REML algorithm ...
Iter. logL V(G) V(e)
1 -481.35 68.89575 0.00012 (1 component(s) constrained)
2 -477.18 60.79832 0.00012 (1 component(s) constrained)
3 -477.15 56.75049 0.00012 (1 component(s) constrained)
4 -477.71 49.36812 0.00012 (1 component(s) constrained)
5 -480.28 50.20353 0.00012 (1 component(s) constrained)
6 -479.86 50.21790 0.00012 (1 component(s) constrained)
7 -479.85 50.21791 0.00012 (1 component(s) constrained)
8 -479.85 50.21791 0.00012 (1 component(s) constrained)
Log-likelihood ratio converged.
Calculating the logLikelihood for the reduced model ...
(variance component 1 is dropped from the model)
Calculating prior values of variance components by EM-REML ...
Updated prior values: 118.24836
logL: -487.47907
Running AI-REML algorithm ...
Iter. logL V(e)
1 -487.48 118.24836
Log-likelihood ratio converged.
Summary result of REML analysis:
Source Variance SE
V(G) 50.217906 13.366741
V(e) 0.000118 20.330764
Vp 50.218024 9.236387
V(G)/Vp 0.999998 0.404850
Variance/Covariance Matrix of the estimates:
178.67
-253.349 413.34
Summary result of REML analysis has been saved in the file [freeze212190k_12260ksubsetfromfullGRM.REML.hsq].
Analysis finished: Fri Feb 7 16:36:08 2014
Computational time: 0:0:0
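For reference, one possible way to set this up is sketched below. The log above has no line about reading the SNP list, which suggests that --extract is not applied when a precomputed GRM is loaded with --grm-bin. A common alternative is to build one GRM from the region of interest and a second from the rest of the genome, then fit both jointly with --mgrm. This is a sketch under assumptions, not a verified recipe: the PLINK prefix freeze2 and the output names region_grm, rest_grm, two_grms.txt and region_vs_rest are hypothetical, and flag behaviour may differ between GCTA versions, so check the documentation for your release. By default the script only prints the gcta64 commands (dry run); set GCTA to the path of the real binary to execute them.

```shell
#!/bin/sh
set -e

# Dry run by default: prints each gcta64 command instead of running it.
# To execute for real: GCTA=~/src/gcta_1.24/gcta64 sh this_script.sh
GCTA="${GCTA:-echo gcta64}"

BFILE="freeze2"                       # hypothetical PLINK file prefix
REGION="12190k_12260ksubset.list"     # SNP list from the original command
PHENO="../phenotypedata_sort.phen"    # phenotype file from the original command

# GRM built only from the SNPs in the region of interest.
$GCTA --bfile "$BFILE" --extract "$REGION" --make-grm --out region_grm

# GRM built from every SNP except those in the region.
$GCTA --bfile "$BFILE" --exclude "$REGION" --make-grm --out rest_grm

# --mgrm expects a text file listing one GRM prefix per line.
printf '%s\n' region_grm rest_grm > two_grms.txt

# Fit both variance components jointly (region and rest of genome).
$GCTA --mgrm two_grms.txt --pheno "$PHENO" --reml --out region_vs_rest --thread-num 1
```

With two GRMs in the model, the variance attributed to the region is estimated separately from the rest of the genome, which appears to be what the papers have in mind when they recommend leaving the region out of the genome-wide GRM.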