Post by demis on Jul 6, 2016 15:36:52 GMT
Hello,
I am trying to run GCTA with --cojo-slct, but I get the following error: "residual variance is out of boundary, the model is over-fitting. Please specify a more stringent p cutoff value." I have tried changing the p-value cutoff, but I either get the same error or the cutoff becomes so stringent that no SNPs are selected.
The input file is based on targeted sequencing data, so it covers only ~20 regions rather than the whole genome. I wonder whether this could be causing the issue, and whether there is any way to resolve it.
Any help on what might be causing the error would be greatly appreciated!
Thanks in advance for your help!
Here is the output:
gcta64 --bfile data_QC --cojo-file summary_all.txt --cojo-slct --cojo-p 5e-8 --cojo-collinear 0.9 --cojo-actual-geno --out test
*******************************************************************
* Genome-wide Complex Trait Analysis (GCTA)
* version 1.26.0
* (C) 2010-2016, The University of Queensland
* MIT License
* Please report bugs to: Jian Yang <jian.yang@uq.edu.au>
*******************************************************************
Analysis started: Wed Jul 6 16:27:33 2016
Options:
--bfile data_QC
--cojo-file summary_all.txt
--cojo-slct
--cojo-p 5e-08
--cojo-collinear 0.9
--cojo-actual-geno
--out test
Reading PLINK FAM file from [data_QC.fam].
2829 individuals to be included from [data_QC.fam].
Reading PLINK BIM file from [data_QC.bim].
79508 SNPs to be included from [data_QC.bim].
Reading PLINK BED file from [data_QC.bed] in SNP-major format ...
Genotype data for 2829 individuals and 79508 SNPs to be included from [data_QC.bed].
Reading GWAS summary-level statistics from [summary_all.txt] ...
GWAS summary statistics of 38960 SNPs read from [summary_all.txt].
Phenotypic variance estimated from summary statistics of all 38960 SNPs: 4.87022 (variance of logit for case-control studies).
Matching the GWAS meta-analysis results to the genotype data ...
38960 SNPs are matched to the genotype data.
Calculating allele frequencies ...
Calculating the variance of SNP genotypes ...
Performing stepwise model selection on 38960 SNPs to select association signals ... (p cutoff = 5e-08; collinearity cutoff = 0.9)
5 associated SNPs have been selected.
10 associated SNPs have been selected.
Error: residual variance is out of boundary, the model is over-fitting. Please specify a more stringent p cutoff value.