Error: the information matrix is not invertible

swp
New Member

Posts: 2

Error: the information matrix is not invertible Apr 20, 2015 19:27:44 GMT

Quote

Post by swp on Apr 20, 2015 19:27:44 GMT

I know this has been addressed in previous threads and the typical explanation is sample size. However, I'm considering nearly 10,000 samples...seems like something other than sample size might be causing this. Any suggestions for how to address this?

Zhihong Zhu Moderator Posts: 88	Error: the information matrix is not invertible Apr 26, 2015 12:03:42 GMT Quote Select Post Deselect Post Link to Post Member Give Gift Back to Top Post by Zhihong Zhu on Apr 26, 2015 12:03:42 GMT Hi there, Did you perform any QC on the dataset? One possible reason is that the true signal might be removed by the QC. How about the estimate without QC? Cheers, Zhihong

swp New Member Posts: 2	Error: the information matrix is not invertible Apr 27, 2015 18:01:05 GMT Quote Select Post Deselect Post Link to Post Member Give Gift Back to Top Post by swp on Apr 27, 2015 18:01:05 GMT Yes...I filtered out SNPs with maf<1%, but that's it. The same dataset has been used in GWAS before and significant hits are detectable. Not sure that's the problem. Any other ideas?

Zhihong Zhu
Moderator

Posts: 88

Error: the information matrix is not invertible Apr 30, 2015 2:00:10 GMT

Quote

Post by Zhihong Zhu on Apr 30, 2015 2:00:10 GMT

Hi there,

It is possible that some SNPs were genome-wide significant but the total SNP heritability is very small, nearly zero. Maybe trying estimating SNP heritability chromosome by chromosome to find out what's going on in each chromosome.

Cheers,
Zhihong

jaleal
New Member

Posts: 5

Error: the information matrix is not invertible Jun 4, 2015 0:12:00 GMT

Quote

Post by jaleal on Jun 4, 2015 0:12:00 GMT

I am trying to estimate additive and dominance variance for some SNPs. I can fit each grm separately, but when fitting together I get the "Error: the information matrix is not invertible." I tried switching to the EM algorithm with --reml-alg 2, but it failed to converge after 200 iterations and was quite slow in getting there.

The sample size is 6000 individuals with 1070 SNPS.

I realize that the dominance variance is low. Is that the only reason? because I have many other replicates with equivalent or lower dominance variances that went through no problem. This error occurs about in about 1/10000 simulated replicates.

I can send you the data as examples if there is an easy way to do that.

GCTA.add.hsq
Source Variance SE
V(G) 0.000460 0.000048
V(e) 0.002211 0.000042
Vp 0.002670 0.000060
V(G)/Vp 0.172143 0.015372
logL 15076.009
logL0 14893.753
LRT 364.513
df 1
Pval 0
n 6000

GCTA.domi.hsq
Source Variance SE
V(G) 0.000014 0.000010
V(e) 0.002553 0.000047
Vp 0.002567 0.000047
V(G)/Vp 0.005442 0.003768
logL 14894.016
logL0 14893.753
LRT 0.526
df 1
Pval 0.2342
n 6000

Zhihong Zhu
Moderator

Posts: 88

Error: the information matrix is not invertible Jun 4, 2015 7:29:27 GMT

Quote

Post by Zhihong Zhu on Jun 4, 2015 7:29:27 GMT

Hi Jaleal,

Yes, I think it could be because dominance is low. I estimated hA (additive heritability at all SNPs) and hD (dominance heritability at all SNPs) for 79 traits, where on average hD/hA = 0.2, and var(hD) / var(hA)=2. Now your hA is almost equal to zero, it is possible that dominance GRM is not invertible. (Please refer details to www.cell.com/ajhg/abstract/S0002-9297(15)00009-9)

Cheers,
Zhihong

jaleal
New Member

Posts: 5

Error: the information matrix is not invertible Jun 4, 2015 14:58:58 GMT

Quote

Post by jaleal on Jun 4, 2015 14:58:58 GMT

Hey Zhihong,

Okay, but thousands of my runs have those characteristics and do not fail. This example is not even close to having the smallest values. For example, this run was successful. All of my hA are on the order 0.05-0.25.

Source Variance SE
V(G1) 0.000236 0.000031
V(G2) -0.000005 0.000004
V(e) 0.002140 0.000041
Vp 0.002370 0.000048
V(G1)/Vp 0.099412 0.012285
V(G2)/Vp -0.002312 0.001730

Am I misinterpreting the meaning of orthogonality in this context? I took it to mean that the estimate of variance associated with a grm is independent any other grm's fit. I expected I could run the following code and get the same estimates of V(A) and V(D) whether I did two separate calls to --reml --grm or one call to --reml --mgrm. Indeed for many examples this appears to be true, within some error. However, in my initial example I can successfully run two separate calls to --grm, but it fails with --mgrm. If the dominance grm is not invertible, why can be it be successfully fit with --reml --grm?

Thanks,
Jaleal

Code:
gcta64 --bfile data --autosome --make-grm --out test_add

##Do the domimance thing
##Generates test_domi.*.grm.*
gcta64 --bfile data --autosome --make-grm-d --out test_domi

#make the file
echo "test_add" > add_domi_grm. txt
echo "test_domi.d" >> add_domi_grm. txt

#pheno file
cut -d" " -f 1,2,6 data.ped > test.phen

#fit the models
gcta64 --reml --grm test_add. --pheno test.phen --out GCTA.add --reml-no-constrain --reml-alg 1 --reml-maxit 200
gcta64 --reml --grm test_domi.d --pheno test.phen --out GCTA.domi --reml-no-constrain --reml-alg 1 --reml-maxit 200
gcta64 --reml --mgrm add_domi_grm.txt --pheno test.phen --out GCTAoutput --reml-no-constrain --reml-alg 1 --reml-maxit 200

Last Edit: Jun 4, 2015 15:41:49 GMT by jaleal

Zhihong Zhu
Moderator

Posts: 88

Error: the information matrix is not invertible Jun 5, 2015 5:33:18 GMT

Quote

Post by Zhihong Zhu on Jun 5, 2015 5:33:18 GMT

Hi Jaleal

No, I don't think you misinterpret the meaning of orthogonality and your conclusion (hD is tiny) is right. But I think there may be something unusual in your dataset.

1. That hD is independent from hA doesn't mean the slope is exactly 0 if I plot hA vs hD, e.g. Figure S2 in that paper. So hD from joint analysis (fitting both GRM_A and GRM_D) could be slightly different from hD from separate analysis (fitting GRM_D only). And hD is ~0 when you fit GRM_D only, i.e. it is likely that the hD would be negative when you fit both GRM_A and GRM_D

2. The GRM is not invertible, suggesting the hD is largely < 0. Although basically we can estimate the hD, it is possible that GRM_D is not invertible when sample size is not large, because var(hD) is large. In the paper, I compared the SNP heritability (i.e. both hA and hD) estimated from Hapmap3 SNPs with the ones estimated from genotyped SNPs in ARIC (figure was not shown in the paper). In general, hA (or hD) estimated from Hapmap3 SNPs is consistent with the ones estimated from genotyped SNPs. Whereas variance of hD from two SNP sets > variance of hA from two SNP sets. So hD is more sensitive to the quality of the data. But I don't think it could change the conclusion.

Go back to your results, I'm sorry I didn't read your results carefully last time. The SEs for both hA and hD are too large. Too my experience, the SE for hA would be ~0.05, for hD would be ~0.07, since sample size of ARIC data in the paper is ~6600 (Table S3 in the paper). I'm not sure why the SE is so large, maybe because you didn't remove related samples or include rare SNPs. Please check your dataset.

Cheers,
Zhihong

jaleal New Member Posts: 5	Error: the information matrix is not invertible Jun 5, 2015 16:20:29 GMT Quote Select Post Deselect Post Link to Post Member Give Gift Back to Top Post by jaleal on Jun 5, 2015 16:20:29 GMT Thanks Zhihong, Getting rid of rare variants (MAF<0.01) seemed to solve this issue. -Jaleal

Error: the information matrix is not invertible

Post by swp on Apr 20, 2015 19:27:44 GMT

Post by Zhihong Zhu on Apr 26, 2015 12:03:42 GMT

Post by swp on Apr 27, 2015 18:01:05 GMT

Post by Zhihong Zhu on Apr 30, 2015 2:00:10 GMT

Post by jaleal on Jun 4, 2015 0:12:00 GMT

Post by Zhihong Zhu on Jun 4, 2015 7:29:27 GMT

Post by jaleal on Jun 4, 2015 14:58:58 GMT

Post by Zhihong Zhu on Jun 5, 2015 5:33:18 GMT

Post by jaleal on Jun 5, 2015 16:20:29 GMT