### Post by anbai on Aug 6, 2023 1:10:19 GMT

Hi,

First, thank you for this wonderful software!

I have two questions: 1) SNP-based heritability, and 2) fastGWA using imputed genotype data.

Question 1: I am estimating the h2 of a quantitative trait by following this tutorial: yanglab.westlake.edu.cn/software/gcta/#GREMLinWGSorimputeddata

I followed the steps exactly, except that I ran Step 1 separately for each chromosome, because running all 22 autosomes at once takes too long. I then merged the per-chromosome output files.
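A minimal sketch of such a per-chromosome merge, for anyone reproducing this. It assumes each per-chromosome run wrote a whitespace-delimited text file with a single header line; the file names and the helper `merge_ld_score_files` are hypothetical and not part of GCTA.

```python
def merge_ld_score_files(paths, out_path):
    """Concatenate per-chromosome tables, writing the header line only once.

    Returns the number of data rows written. Raises ValueError if a file's
    header does not match the header of the first file.
    """
    header = None
    n_rows = 0
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                file_header = handle.readline().rstrip("\n")
                if header is None:
                    # First file: remember its header and write it once.
                    header = file_header
                    out.write(header + "\n")
                elif file_header.split() != header.split():
                    raise ValueError(f"header mismatch in {path}")
                for line in handle:
                    if line.strip():  # skip blank lines
                        out.write(line if line.endswith("\n") else line + "\n")
                        n_rows += 1
    return n_rows
```

Calling it with the 22 per-chromosome files in chromosome order (for example, a hypothetical `chr1.score.ld` through `chr22.score.ld`) would give one combined table. Comparing the returned row count with the sum of the per-chromosome SNP counts is a quick check that nothing was dropped or duplicated.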

The program ran without error, but the result is unexpected: h2 ≈ 0.

I also merged the multiple GRMs into a single GRM, 'segment_based_ld_score_stratfied', which is the input shown in the log below.

Here is the log:

*******************************************************************

* Genome-wide Complex Trait Analysis (GCTA)

* version 1.93.2 beta Linux

* (C) 2010-present, Jian Yang, The University of Queensland

* Please report bugs to Jian Yang <jian.yang.qt@gmail.com>

*******************************************************************

Analysis started at 17:40:43 EDT on Sat Aug 05 2023.

Hostname: ***

Accepted options:

--reml

--grm segment_based_ld_score_stratfied

--pheno pheno_normalized_residualized_h2.phen

--keep keep_for_gcta_h2.txt

--thread-num 8

--mpheno 1

--grm-adj 0

--out h2

Note: the program will be running on 8 threads.

Reading IDs of the GRM from [segment_based_ld_score_stratfied.grm.id].

33743 IDs read from [segment_based_ld_score_stratfied.grm.id].

Reading the GRM from [segment_based_ld_score_stratfied.grm.bin].

Reading the number of SNPs for the GRM from [segment_based_ld_score_stratfied.grm.N.bin].

GRM for 33743 individuals are included from [segment_based_ld_score_stratfied.grm.bin].

Reading phenotypes from [pheno_normalized_residualized_h2.phen].

Non-missing phenotypes of 25784 individuals are included from [pheno_normalized_residualized_h2.phen].

25784 individuals are kept from [keep_for_gcta_h2.txt].

Adjusting the GRM for sampling errors ...

25784 individuals are in common in these files.

Performing REML analysis ... (Note: may take hours depending on sample size).

25784 observations, 1 fixed effect(s), and 2 variance component(s)(including residual variance).

Calculating prior values of variance components by EM-REML ...

Updated prior values: 0.495765 0.971934

logL: -16812

Running AI-REML algorithm ...

Iter. logL V(G) V(e)

1 -13043.33 0.00000 0.63638 (1 component(s) constrained)

2 -14287.52 0.00000 0.70855 (1 component(s) constrained)

3 -13624.12 0.00000 0.91135 (1 component(s) constrained)

4 -12850.05 0.00000 0.93494 (1 component(s) constrained)

5 -12825.20 0.00000 0.95211 (1 component(s) constrained)

6 -12812.96 0.00000 0.96440 (1 component(s) constrained)

7 -12807.02 0.00000 0.97308 (1 component(s) constrained)

8 -12804.17 0.00000 0.97914 (1 component(s) constrained)

9 -12802.81 0.00000 0.98336 (1 component(s) constrained)

10 -12802.17 0.00000 0.98627 (1 component(s) constrained)

11 -12801.86 0.00000 0.99262 (1 component(s) constrained)

12 -12801.60 0.00000 0.99266 (1 component(s) constrained)

13 -12801.60 0.00000 0.99266 (1 component(s) constrained)

Log-likelihood ratio converged.

Calculating the logLikelihood for the reduced model ...

(variance component 1 is dropped from the model)

Calculating prior values of variance components by EM-REML ...

Updated prior values: 0.99266

logL: -12801.59520

Running AI-REML algorithm ...

Iter. logL V(e)

1 -12801.60 0.99266

Log-likelihood ratio converged.

Summary result of REML analysis:

Source Variance SE

V(G) 0.000001 0.000464

V(e) 0.992658 0.008755

Vp 0.992659 0.008743

V(G)/Vp 0.000001 0.000468

Sampling variance/covariance of the estimates of variance components:

2.155698e-07 -2.140670e-07

-2.140670e-07 7.664834e-05
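As a quick sanity check on the summary table, the reported quantities can be recomputed from each other. This is only a sketch using the numbers printed in the log; the variable names are illustrative. It confirms that Vp = V(G) + V(e) and that V(G)/Vp is effectively zero, which is consistent with the "(1 component(s) constrained)" notes in the iteration log.

```python
# Values copied from the "Summary result of REML analysis" table in the log.
v_g, se_v_g = 0.000001, 0.000464
v_e = 0.992658
v_p_reported = 0.992659

v_p = v_g + v_e       # total phenotypic variance
h2 = v_g / v_p        # SNP-based heritability estimate
ratio = v_g / se_v_g  # estimate relative to its standard error

# The recomputed Vp should match the reported one to the printed precision.
assert abs(v_p - v_p_reported) < 1e-6

print(f"Vp = {v_p:.6f}, V(G)/Vp = {h2:.6f}, V(G)/SE = {ratio:.4f}")
```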

Question 2: I also want to run fastGWA by following this tutorial: yanglab.westlake.edu.cn/software/gcta/#fastGWA

For the full-dense GRM, I used the one above (segment_based_ld_score_stratfied) to generate the sparse GRM. The program has now been running for a full day and is still stuck at this step (using 8 threads):

After matching all the files, 44875 individuals to be included in the analysis.

Estimating the genetic variance (Vg) by fastGWA-REML (grid search)...
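For context on the sparse GRM mentioned above: conceptually, sparsifying a dense GRM means keeping the diagonal and only those off-diagonal entries above a relatedness cutoff, and treating the rest as zero. The sketch below illustrates that idea on toy data. The 0.05 cutoff, the 3x3 matrix, and the function `sparsify_grm` are assumptions for illustration, not GCTA's implementation or file format.

```python
def sparsify_grm(dense, cutoff=0.05):
    """Return (i, j, value) triplets for the lower triangle of a dense GRM.

    Diagonal entries are always kept. Off-diagonal entries are kept only if
    they exceed the cutoff; all others are dropped, i.e. treated as zero.
    """
    triplets = []
    for i, row in enumerate(dense):
        for j in range(i + 1):
            value = row[j]
            if i == j or value > cutoff:
                triplets.append((i, j, value))
    return triplets


# Toy data (made up): individuals 0 and 1 are close relatives,
# individual 2 is unrelated to both.
dense_grm = [
    [1.01, 0.48, 0.001],
    [0.48, 0.99, -0.002],
    [0.001, -0.002, 1.00],
]
sparse = sparsify_grm(dense_grm)
print(sparse)
```

In this toy case only the three diagonal entries and the one related pair survive. Counting the surviving off-diagonal pairs in a real sparse GRM is one way to check whether the input GRM behaves as expected before running the association step.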

Any suggestions for my two questions?

Thanks