Post by zkuncheva on Oct 12, 2021 16:08:26 GMT
Hi,
We are running the latest beta version of GCTA with the following command on a downsampled UK Biobank dataset:
gcta64 --bfile /ukb/plink_format/ukb_v3_chr5.downsampled10k --chr 5 --extract GTEX_v8.snplist.txt --maf 0.005 --cojo-p 1e-05 --cojo-wind 2000 --cojo-collinear 0.9 --cojo-file GTEX_v8.gcta_format.tsv --cojo-slct --out GTEX_v8.gcta_out
It works fine for all but three chromosomes (5, 6, and 8). For these three chromosomes, once the stepwise selection begins, GCTA crashes with a segmentation fault instead of starting to select variables. We tried changing the collinearity threshold and using different thread numbers, but nothing worked. Any ideas why we get this error only for some chromosomes but not for others?
Reading PLINK FAM file from [ukb_v3_chr5.downsampled10k.fam].
10000 individuals to be included from [ukb_v3_chr5.downsampled10k.fam].
Reading PLINK BIM file from [ukb_v3_chr5.downsampled10k.bim].
1006335 SNPs to be included from [ukb_v3_chr5.downsampled10k.bim].
Reading a list of SNPs from [GTEX_v8_chr5.snplist.txt].
12387 SNPs are extracted from [GTEX_v8_chr5.snplist.txt].
12387 SNPs on chromosome 5 are included in the analysis.
Reading PLINK BED file from [ukb_v3_chr5.downsampled10k.bed] in SNP-major format ...
Genotype data for 10000 individuals and 12387 SNPs to be included from [ukb_v3_chr5.downsampled10k.bed].
Calculating allele frequencies ...
Filtering SNPs with MAF > 0.005 ...
After filtering SNPs with MAF > 0.005, there are 12380 SNPs (7 SNPs with MAF < 0.005).
Reading GWAS summary-level statistics from [GTEX_v8_chr5.gcta_format.tsv] ...
GWAS summary statistics of 13800 SNPs read from [GTEX_v8_chr5.gcta_format.tsv].
Phenotypic variance estimated from summary statistics of all 13800 SNPs: 0.503764 (variance of logit for case-control studies).
Matching the GWAS meta-analysis results to the genotype data ...
5 SNP(s) have large difference of allele frequency among the GWAS summary data and the reference sample. These SNPs have been saved in [GTEX_v8_chr5.gcta_out.freq.badsnps].
12375 SNPs are matched to the genotype data.
Calculating the variance of SNP genotypes ...
Performing stepwise model selection on 12375 SNPs to select association signals ... (p cutoff = 1e-05; collinearity cutoff = 0.9)
(Assuming complete linkage equilibrium between SNPs which are more than 2Mb away from each other)
Segmentation fault (core dumped)
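In case it helps with narrowing this down, here is a minimal sketch of a sanity check that could be run on the summary statistics file before calling GCTA. It assumes the standard COJO column order (SNP, A1, A2, freq, b, se, p, N) with a header on the first line. The function name check_cojo_lines and the specific checks (duplicated SNP IDs, rows with the wrong number of columns, non-numeric or non-finite values, non-positive standard errors) are our own illustration, and we have not confirmed that any of these actually causes the segmentation fault.

```python
import math
from collections import Counter

# Assumed COJO layout: SNP A1 A2 freq b se p N (8 whitespace-separated columns).
N_COLUMNS = 8


def check_cojo_lines(lines):
    """Scan COJO-format rows and report rows that look suspicious.

    `lines` is any iterable of strings; the first non-empty line is
    treated as a header and skipped. Returns a dict of SNP ID lists.
    """
    rows = [line.split() for line in lines if line.strip()]
    problems = {"duplicates": [], "wrong_width": [], "bad_values": []}
    if not rows:
        return problems

    seen = Counter()
    for fields in rows[1:]:  # skip the header row
        snp = fields[0]
        seen[snp] += 1
        if len(fields) != N_COLUMNS:
            problems["wrong_width"].append(snp)
            continue
        try:
            freq, b, se, p, n = (float(x) for x in fields[3:])
        except ValueError:
            # Catches entries such as "NA" or empty strings.
            problems["bad_values"].append(snp)
            continue
        # float() accepts "nan" and "inf", so test finiteness explicitly.
        if not all(math.isfinite(v) for v in (freq, b, se, p, n)) or se <= 0:
            problems["bad_values"].append(snp)

    problems["duplicates"] = [snp for snp, count in seen.items() if count > 1]
    return problems
```

To run it on a real file, something like `with open("GTEX_v8.gcta_format.tsv") as fh: print(check_cojo_lines(fh))` should work, since a file handle is already an iterable of lines. Comparing the output for a failing chromosome (5, 6, or 8) against a chromosome that completes normally might show whether the input differs in any of these respects.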