The variance-covariance matrix V is not positive definitive

The variance-covariance matrix V is not positive definitive Jan 9, 2019 20:12:58 GMT

Quote

Post by moldach on Jan 9, 2019 20:12:58 GMT

I'm running into this apparently common error and trying to understand/troubleshoot it.

I ran GREML first on two cohorts from a GWAS (caucasians and africans) separately but ran into issues of high SE (and p-value) because of the low sample size (this was done because a post-doc who requested the analysis wanted to compare these against LDHUB which calculates SNP heritability based on GWAS summary statistics for a specific ethnic group). I also wanted to combine these two and run GREML with ethnicity as a covariate.

I combined the binary files (.bed, .bim, .fam) of the two cohorts together with plink merge before using GCTA::GREML.

plink --bfile AA --bmerge EA --make-bed --out merged
plink --bfile EA --exclude merged.missnp --make-bed --out EA_tmp
plink --bfile AA --exclude merged.missnp --make-bed --out AA_tmp
plink --bfile EA_tmp --bmerge AA_tmp --make-bed --out merged

Next I continued with the commands from the documentation (including Step2 option#1): cnsgenomics.com/software/gcta/#GREMLinWGSorimputeddata until the "REML analysis with multiple GRMs" when I get "Error: the variance-covariance matrix V is not positive definite"

gcta64 --reml --mgrm multi_GRMs.txt --pheno phen.txt --out test
It's my understanding that if the variance-covariance V is negative definite this could be cause by a low sample number; however, this shouldn't be the case for positive definite. Furthermore, "for unrelated individuals and common SNPs, you will need at least 3160 unrelated samples to get a SE down to 0.1 (see Visscher et al. 2014 PLoS Genet). Once I combine Caucasian and Africans groups (2061 + 1131 = 3192) I should have sufficient power?

The advice here (gcta.freeforums.net/thread/191/gcta-reml-error) suggests that there may be something wrong with the GRMs and to check for any missing values in the GRM(s) with the following R function:

ReadGRMBin=function(prefix, AllN=F, size=4){
 sum_i=function(i){
 return((1+i)*i/2)
 }

  BinFileName=paste(prefix,".grm.bin",sep="")
  NFileName=paste(prefix,".grm.N.bin",sep="")
  IDFileName=paste(prefix,".grm.id",sep="")
  id = read.table(IDFileName)
  n=dim(id)[1]
  BinFile=file(BinFileName, "rb");
  grm=readBin(BinFile, n=n*(n+1)/2, what=numeric(0), size=size)
  NFile=file(NFileName, "rb");
  if(AllN==T){
    N=readBin(NFile, n=n*(n+1)/2, what=numeric(0), size=size)
  }
  else N=readBin(NFile, n=1, what=numeric(0), size=size)
  i=sapply(1:n, sum_i)
  return(list(diag=grm[i], off=grm[-i], id=id, N=N))
}

So in GCTA::GREML Step #3 we created multiple GRMs (e.g.):

gcta64 --bfile merged --extract snp_group1.txt --make-grm --out merged_group1
So I check for missing values in all the GRMs like this (e.g. for GRM 1):

grmb1 <- ReadGRMBin("merged_group1")

anyNA(grmb1$diag)

[1] FALSE

I assume I'm checking these GRMs correctly and since I get FALSE for each of the four GRMs these should be okay, right?

I tried two other suggestions as well. The --reml-no-constrain option (gcta.freeforums.net/thread/190/variance-covaraince-matrix-error), which gives the same error:

gcta64 --reml-no-constrain --mgrm multi_GRMs.txt --pheno merged.phen --out merged
and the --reml-bendV option (gcta.freeforums.net/thread/243/variance-covaraince-matrix-positive-definite) which seems to run (doesn't given any errors) but only produces the .log file and not the .hsq (results)

gcta64 --reml-bendV --mgrm multi_GRMs.txt --pheno merged.phen --out merged

I'm currently trying to get around the problem using the Fisher scoring approach at this step:

gcta64 --bfile merged --extract snp_group1.txt --make-grm-alg 1 --out alg1_group1
Any feedback is greatly appreciated!