Post by anbai on Aug 6, 2023 1:10:19 GMT
Hi,
First, thank you for this wonderful software!
I have two questions: 1) SNP-based heritability, and 2) fastGWA using imputed genotype data.
Question 1: I am estimating the h2 of a quantitative trait by following this tutorial: yanglab.westlake.edu.cn/software/gcta/#GREMLinWGSorimputeddata
I followed the exact steps, except that I ran Step 1 per chromosome, because running all 22 autosomal chromosomes at once takes too long. I then merged the output files together.
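For readers who want to reproduce the merging step, here is a minimal sketch of one way per-chromosome output files could be concatenated into a single file. It assumes each file is plain text with one identical header line. The file names (`chr1.score.ld` to `chr22.score.ld`, `all_chr.score.ld`) and the function name `merge_per_chromosome` are hypothetical and only illustrate the idea; they are not necessarily the exact files or procedure used in this post.

```python
from pathlib import Path


def merge_per_chromosome(paths, out_path):
    """Concatenate text files that share one header line, writing the header once.

    Returns the number of data rows written.
    """
    header = None
    n_rows = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as fh:
                first = fh.readline()
                if not first:
                    continue  # empty file, nothing to copy
                if header is None:
                    # First non-empty file: remember and write its header.
                    header = first
                    out.write(header if header.endswith("\n") else header + "\n")
                elif first.split() != header.split():
                    # Guard against mixing files with different columns.
                    raise ValueError(f"header mismatch in {path}")
                for line in fh:
                    if line.strip():
                        out.write(line if line.endswith("\n") else line + "\n")
                        n_rows += 1
    return n_rows


if __name__ == "__main__":
    # Hypothetical file names; adjust to the actual per-chromosome outputs.
    files = [Path(f"chr{c}.score.ld") for c in range(1, 23)]
    existing = [p for p in files if p.exists()]
    if existing:
        print(merge_per_chromosome(existing, "all_chr.score.ld"), "rows written")
```

Checking the header of every file before appending is a cheap way to catch a chromosome whose run failed or produced a different column layout.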
The program ran without error, but the result is unexpected: h2 ≈ 0.
I also merged the multiple GRMs into a single one, 'segment_based_ld_score_stratfied', and used it as the input, as shown in the log below.
Here is the log:
*******************************************************************
* Genome-wide Complex Trait Analysis (GCTA)
* version 1.93.2 beta Linux
* (C) 2010-present, Jian Yang, The University of Queensland
* Please report bugs to Jian Yang <jian.yang.qt@gmail.com>
*******************************************************************
Analysis started at 17:40:43 EDT on Sat Aug 05 2023.
Hostname: ***
Accepted options:
--reml
--grm segment_based_ld_score_stratfied
--pheno pheno_normalized_residualized_h2.phen
--keep keep_for_gcta_h2.txt
--thread-num 8
--mpheno 1
--grm-adj 0
--out h2
Note: the program will be running on 8 threads.
Reading IDs of the GRM from [segment_based_ld_score_stratfied.grm.id].
33743 IDs read from [segment_based_ld_score_stratfied.grm.id].
Reading the GRM from [segment_based_ld_score_stratfied.grm.bin].
Reading the number of SNPs for the GRM from [segment_based_ld_score_stratfied.grm.N.bin].
GRM for 33743 individuals are included from [segment_based_ld_score_stratfied.grm.bin].
Reading phenotypes from [pheno_normalized_residualized_h2.phen].
Non-missing phenotypes of 25784 individuals are included from [pheno_normalized_residualized_h2.phen].
25784 individuals are kept from [keep_for_gcta_h2.txt].
Adjusting the GRM for sampling errors ...
25784 individuals are in common in these files.
Performing REML analysis ... (Note: may take hours depending on sample size).
25784 observations, 1 fixed effect(s), and 2 variance component(s)(including residual variance).
Calculating prior values of variance components by EM-REML ...
Updated prior values: 0.495765 0.971934
logL: -16812
Running AI-REML algorithm ...
Iter. logL V(G) V(e)
1 -13043.33 0.00000 0.63638 (1 component(s) constrained)
2 -14287.52 0.00000 0.70855 (1 component(s) constrained)
3 -13624.12 0.00000 0.91135 (1 component(s) constrained)
4 -12850.05 0.00000 0.93494 (1 component(s) constrained)
5 -12825.20 0.00000 0.95211 (1 component(s) constrained)
6 -12812.96 0.00000 0.96440 (1 component(s) constrained)
7 -12807.02 0.00000 0.97308 (1 component(s) constrained)
8 -12804.17 0.00000 0.97914 (1 component(s) constrained)
9 -12802.81 0.00000 0.98336 (1 component(s) constrained)
10 -12802.17 0.00000 0.98627 (1 component(s) constrained)
11 -12801.86 0.00000 0.99262 (1 component(s) constrained)
12 -12801.60 0.00000 0.99266 (1 component(s) constrained)
13 -12801.60 0.00000 0.99266 (1 component(s) constrained)
Log-likelihood ratio converged.
Calculating the logLikelihood for the reduced model ...
(variance component 1 is dropped from the model)
Calculating prior values of variance components by EM-REML ...
Updated prior values: 0.99266
logL: -12801.59520
Running AI-REML algorithm ...
Iter. logL V(e)
1 -12801.60 0.99266
Log-likelihood ratio converged.
Summary result of REML analysis:
Source Variance SE
V(G) 0.000001 0.000464
V(e) 0.992658 0.008755
Vp 0.992659 0.008743
V(G)/Vp 0.000001 0.000468
Sampling variance/covariance of the estimates of variance components:
2.155698e-07 -2.140670e-07
-2.140670e-07 7.664834e-05
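The figures in the summary can be checked by hand. The sketch below copies V(G), V(e) and the two log-likelihoods from the log above and recomputes V(G)/Vp together with the likelihood-ratio statistic 2 * (logL_full - logL_reduced). The variable names are illustrative, and this only reproduces the arithmetic; it says nothing about how GCTA computes these values internally.

```python
# Values copied from the REML log above.
v_g = 0.000001             # V(G), genetic variance
v_e = 0.992658             # V(e), residual variance
logl_full = -12801.60      # final logL of the full model (printed to 2 decimals)
logl_reduced = -12801.59520  # logL of the reduced model (V(G) dropped)

v_p = v_g + v_e            # phenotypic variance, Vp = V(G) + V(e)
h2 = v_g / v_p             # SNP-based heritability estimate, V(G)/Vp

# Likelihood-ratio statistic. The raw difference is slightly negative only
# because the full-model logL is rounded in the log, so it is clamped at 0 here.
lrt = max(0.0, 2.0 * (logl_full - logl_reduced))

print(f"Vp  = {v_p:.6f}")
print(f"h2  = {h2:.6f}")
print(f"LRT = {lrt:.4f}")
```

Both models have essentially the same log-likelihood, which is consistent with V(G) being constrained at zero in every AI-REML iteration.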
Question 2: I also want to run fastGWA by following this tutorial: yanglab.westlake.edu.cn/software/gcta/#fastGWA
For the full dense GRM, I used the one above (segment_based_ld_score_stratfied) to generate the sparse GRM. The program has now been running for one day and is still at this step (using 8 threads):
After matching all the files, 44875 individuals to be included in the analysis.
Estimating the genetic variance (Vg) by fastGWA-REML (grid search)...
Any suggestions for my two questions?
Thanks