|
Post by Jian Yang on Oct 16, 2013 23:39:25 GMT
I fixed a bug in the bivariate analysis when covariates are included. I rewrote the code for the option --dosage-mach. I hope the long-standing, platform-specific memory leak has been fixed. I added a new option to the mixed linear model association analysis and changed the syntax. www.complextraitgenomics.com/software/gcta/download.html
|
|
|
Post by roberto on Oct 17, 2013 19:23:57 GMT
Hi Jian,
I just tried the new 1.21 version's bivariate analysis, and I keep getting the following error:
Error: the X^t * V^-1 * X matrix is not invertible. Please check the covariate(s) and/or the environmental factor(s).
And this is the command I used:
gcta121/gcta64 --grm-bin grm-all --keep keep.txt --reml-bivar-lrt-rg 1 --out hsq-all/biv --reml-bivar --pheno tmp_pheno --qcovar tmp_qcovar --covar tmp_covar
tmp_pheno looks like this:
000000001274 000000001274 -0.167719244 154
000000022453 000000022453 -0.093762318 172
000000075717 000000075717 -0.097659826 171
000000106601 000000106601 -0.070350434 162
000000106871 000000106871 -0.12887918 161
etc.
tmp_qcovar like this:
000000001274 000000001274 5487 0.0279846 -0.00703679 -0.00341286 -0.000282325 0.00663578 0.0366256 -0.00250858 -0.0165633 -0.0171531 0.0473404
000000022453 000000022453 5342.72808 0.0196633 0.022664 -0.0186467 0.00536748 0.00679019 -0.00224945 -0.0123797 0.00953261 -0.00987542 -0.0149596
000000075717 000000075717 5395 -0.0454948 -0.0129677 -0.00140345 0.0449091 0.0099883 -0.0149986 -0.0206333 -0.0265092 0.0193177 0.0167608
000000106871 000000106871 5720 0.0152528 0.0250063 -0.000616388 0.0174142 -0.0204757 0.0231832 0.00557928 -0.00476903 0.016817 -0.000322666
000000112288 000000112288 5262 0.00355057 -0.00964048 -0.0303913 0.0153188 -0.0240308 0.0419977 -0.0206399 -0.0332632 0.0252674 0.00192255
etc.
tmp_covar like this:
000000001274 000000001274 NOTTINGHAM Female
000000022453 000000022453 DRESDEN Male
000000075717 000000075717 NOTTINGHAM Male
000000106601 000000106601 PARIS Male
000000106871 000000106871 DRESDEN Female
etc.
and the log (before the error):
Note: This is a multi-thread program. You could specify the number of threads by the --thread-num option to speed up the computation if there are multiple processors in your machine.
Reading IDs of the GRM from [grm-all.grm.id].
2087 IDs read from [grm-all.grm.id].
Reading the GRM from [grm-all.grm.bin].
Reading the number of SNPs for the GRM from [grm-all.grm.N.bin].
Pairwise genetic relationships between 2087 individuals are included from [grm-all.grm.bin].
Reading phenotypes from [tmp_pheno].
There are 2 traits specified in the file [tmp_pheno].
Traits 1 and 2 are included in the bivariate analysis.
Nonmissing phenotypes of 1959 individuals are included from [tmp_pheno].
Reading quantative covariates from [tmp_qcovar].
11 quantative covarites of 1802 individuals read from [tmp_qcovar].
Reading discrete covariate(s) from [tmp_covar].
2 discrete covariate(s) of 2089 individuals are included from [tmp_covar].
1658 individuals are kept from [keep.txt].
1658 individuals are in common in these files.
1658 non-missing phenotypes for trait #1 and 1658 for trait #2
11 quantitative variable(s) included as covariate(s).
2 discrete variable(s) included as covariate(s).
Performing bivariate REML analysis ... (Note: may take hours depending on sample size).
3316 observations, 40 fixed effect(s), and 6 variance component(s)(including residual variance).
Calculating prior values of variance components by EM-REML ...
Analysis finished: Thu Oct 17 20:39:03 2013
Computational time: 0:0:11
thank you! roberto
|
|
|
Post by Jian Yang on Oct 21, 2013 0:16:50 GMT
If the covariates are correlated, there could be a collinearity issue, where one covariate is a linear function of the others. I would suggest you check whether you can run a multiple regression of the trait on all these covariates in R, which is a simple way of checking for a collinearity problem.
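For readers without R at hand, the same kind of check can be sketched in plain Python: build the fixed-effect design matrix and compare its rank with its number of columns. This is only an illustration, not how GCTA builds or tests its design matrix. The function names, the tolerance, and the tiny example matrices are assumptions made up for the sketch.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix by Gaussian elimination with partial pivoting.

    `tol` is an absolute threshold chosen for illustration only.
    """
    m = [[float(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the row (at or below the current rank) with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot: this column depends on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def is_collinear(design):
    """True if at least one column is a linear combination of the others."""
    return matrix_rank(design) < len(design[0])


# Intercept plus two dummy columns for a three-level factor: full column rank.
design_ok = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 0]]
# Intercept plus all three dummy columns: the dummies sum to the intercept.
design_bad = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1], [1, 1, 0, 0]]

print(is_collinear(design_ok), is_collinear(design_bad))
```

A fixed absolute tolerance is sensitive to the scale of the columns, so a check like this is most reliable after the quantitative covariates have been brought to comparable ranges.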
|
|
|
Post by roberto on Oct 28, 2013 14:40:16 GMT
Dear Jian,
I found the problem! It was not the variables themselves, but their scale. The problem came from my 'age' variable, which is coded in days (values are on the order of 5000 days in my cohort). I divided these values by 365 to get age in years, and now everything works fine.
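A rescaling like the one described here could be applied to the covariate file with a few lines of Python. This is a minimal sketch: it assumes the age column is the third whitespace-separated field (index 2), as in the tmp_qcovar excerpt earlier in the thread, and the helper name `rescale_column` and the four-decimal output format are invented for the example.

```python
def rescale_column(rows, col, divisor):
    """Return copies of whitespace-delimited rows with field `col` divided by `divisor`."""
    out = []
    for row in rows:
        fields = row.split()
        fields[col] = f"{float(fields[col]) / divisor:.4f}"
        out.append(" ".join(fields))
    return out


# Two shortened rows in the FID IID age(days) PC1 layout (assumed layout).
rows = [
    "000000001274 000000001274 5487 0.0279846",
    "000000022453 000000022453 5342.72808 0.0196633",
]

# Convert age from days to years so it is closer in magnitude to the other covariates.
for line in rescale_column(rows, col=2, divisor=365.0):
    print(line)
```

The IDs are kept as strings throughout, so their leading zeros are preserved.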
thank you for your help!
roberto
|
|
|
Post by modi2020 on Feb 10, 2014 17:48:49 GMT
Hi Jian,
I have a question about how to specify the bivariate analysis in GCTA. I have a binary and a continuous phenotype that I want to fit. The webpage www.complextraitgenomics.com/software/gcta/reml_bivar.html says that we can fit a bivariate model with two continuous traits, two binary traits, or one continuous and one binary trait. It then explains how to fit two phenotypes using --reml-bivar 1 2. It also says how to fit two binary phenotypes and specify the prevalence of both using --reml-bivar-prevalence 0.1 0.05.
I want to specify the prevalence only for the binary trait in my bivariate model. What would be the syntax for doing that? For example, if the binary trait is the first one and the continuous trait is the second in my phenotype file, can I just say --reml-bivar 1 2 --reml-bivar-prevalence 0.1?
Thank you
|
|
|
Post by Jian Yang on Feb 20, 2014 8:35:20 GMT
For a bivariate analysis of a quantitative trait and a case-control study, GCTA does not yet transform the estimate of h2 from the observed scale to the underlying scale. So, in this case, the option --prevalence will not work. I will consider adding that in the future.
|
|
|
Post by modi2020 on Mar 15, 2014 17:05:30 GMT
Hi Jian,
Sorry to ask again. I meant that I was confused about how to fit a binary and a continuous trait in a bivariate analysis. Say the prevalence of the binary trait is 0.10: what would the GCTA command be to fit it alongside the continuous trait? Can you give an example, please? I very much appreciate your help.
Thank you
|
|
|
Post by Jian Yang on Mar 17, 2014 1:48:35 GMT
For a bivariate GREML analysis of a case-control study and a quantitative trait, you can simply treat the case-control status as a 0-1 trait. The analysis is then run in the same way as for two quantitative traits. As I said before, the --prevalence option does not work in this case.
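As a rough illustration of that advice, the sketch below assembles such a command in Python using only options that appear earlier in this thread (--grm-bin, --pheno, --reml-bivar, --out) and leaves out any prevalence option. The helper name and the file prefixes are placeholders borrowed from roberto's post, and the phenotype file is assumed to hold the 0-1 trait in the first trait column and the quantitative trait in the second.

```python
import shlex


def bivariate_greml_args(grm_prefix, pheno_file, out_prefix, traits=(1, 2)):
    """Build an argument list for a bivariate GREML run without a prevalence option."""
    return [
        "gcta64",
        "--grm-bin", grm_prefix,
        "--pheno", pheno_file,
        # Trait 1 is assumed to be the 0-1 case-control trait, trait 2 the quantitative one.
        "--reml-bivar", str(traits[0]), str(traits[1]),
        "--out", out_prefix,
    ]


# Placeholder file names; substitute your own.
args = bivariate_greml_args("grm-all", "tmp_pheno", "hsq-all/biv")
print(shlex.join(args))
```

Building the command as a list keeps each argument separate, so the same list can be passed to subprocess.run without going through a shell.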
|
|
|
Post by modi2020 on Mar 18, 2014 22:04:41 GMT
Got it! Thank you, Jian.
|
|