Post by runner on Dec 22, 2014 21:48:37 GMT
hi everyone,
i have a question concerning the input for GCTA. I have .impute2 and .impute2_info files for chr 1 to 22, filtered with info>0.6 and maf 0.01. What would be the best way to calculate the GRM based on this data? Is there a tool to convert the impute2 data to the MACH dosage format? Or would it be better to use hard-called genotypes (with e.g. a 90% threshold for the probability)?
thanks for your help!
|
|
|
Post by Jian Yang on Jan 7, 2015 23:45:15 GMT
The easiest way is to convert the impute2 dosage data to hard-called genotypes in PLINK format.
|
|
|
Post by runner1 on Jan 13, 2015 14:41:53 GMT
thanks for your answer! when i convert to hard-call genotypes, which threshold would you suggest? should i run a QC afterwards, because i think there will be some SNPs with high missing rate when i use 90% threshold probability.
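To make the missing-rate concern concrete, here is a minimal sketch of probability-threshold hard calling. It assumes each individual has three genotype probabilities in the order (AA, AB, BB), as in IMPUTE2 output; the function names and the example numbers are made up for illustration and are not part of GCTA or PLINK.

```python
def hard_call(probs, threshold=0.9):
    """Return the number of B alleles (0, 1 or 2) for one individual,
    or None (missing) if no genotype probability reaches the threshold.

    probs: (P(AA), P(AB), P(BB)) -- assumed IMPUTE2 ordering.
    """
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None


def missing_rate(calls):
    """Fraction of individuals with no call at this SNP."""
    return sum(c is None for c in calls) / len(calls)


# Made-up probabilities for one SNP in four individuals.
snp = [
    (0.98, 0.02, 0.00),
    (0.10, 0.85, 0.05),
    (0.00, 0.05, 0.95),
    (0.40, 0.35, 0.25),
]
calls = [hard_call(p) for p in snp]
print(calls)                # [0, None, 2, None]
print(missing_rate(calls))  # 0.5
```

Any individual whose best genotype falls below the threshold becomes missing, so a per-SNP missingness filter after conversion is the natural follow-up check.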
|
|
|
Post by mandar on May 3, 2016 19:03:52 GMT
I would like to keep the imputation dosage information. I understand that hard calling is probably easiest but is there any method developed to maintain dosage information while creating GRMs? Thank you
|
|
|
Post by Jian Yang on May 5, 2016 4:18:37 GMT
runner1 wrote: "thanks for your answer! when i convert to hard-call genotypes, which threshold would you suggest? should i run a QC afterwards, because i think there will be some SNPs with high missing rate when i use 90% threshold probability."
I usually convert dosage codes from 0-0.5 to 0, 0.5-1.5 to 1 and 1.5-2.0 to 2. For imputation R2 (IMPUTE-INFO), I would use a threshold of 0.3 (see Yang et al. 2015 NG www.nature.com/ng/journal/v47/n10/full/ng.3390.html).
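The dosage binning described above can be sketched as follows. This is only an illustration: the post does not say which bin the exact boundary values 0.5 and 1.5 fall into, so assigning them to the upper bin is an assumption, and the function names are hypothetical. The dosage is taken as the expected count of allele B, P(AB) + 2*P(BB).

```python
def expected_dosage(p_aa, p_ab, p_bb):
    """Expected number of B alleles from the three genotype probabilities."""
    return p_ab + 2.0 * p_bb


def dosage_to_genotype(dosage):
    """Bin a dosage in [0, 2] into a hard call of 0, 1 or 2.

    Assumption: the boundaries 0.5 and 1.5 go to the upper bin.
    """
    if not 0.0 <= dosage <= 2.0:
        raise ValueError("dosage must lie between 0 and 2")
    if dosage < 0.5:
        return 0
    if dosage < 1.5:
        return 1
    return 2


dosages = [0.02, 0.95, 1.6, 0.5]
print([dosage_to_genotype(d) for d in dosages])  # [0, 1, 2, 1]
```

Because every dosage lands in one of the three bins, this conversion produces no missing genotypes, unlike a probability threshold.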
|
|
|
Post by Jian Yang on May 5, 2016 4:34:17 GMT
mandar wrote: "I would like to keep the imputation dosage information. I understand that hard calling is probably easiest but is there any method developed to maintain dosage information while creating GRMs? Thank you"
Please check the --dosage-mach option. Currently GCTA only supports dosage data from MACH.
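For readers wondering what a dosage-based GRM looks like numerically, below is a rough sketch that plugs dosages into the usual GRM definition, A_jk = (1/N) * sum_i (x_ij - 2p_i)(x_ik - 2p_i) / (2p_i(1 - p_i)), with p_i estimated as half the mean dosage. This is not GCTA's code, and whether GCTA treats dosages in exactly this way is an assumption; the function name and the example values are made up.

```python
def grm_from_dosages(dosages):
    """Genetic relationship matrix from a SNP-by-individual dosage table.

    dosages[i][j] is the dosage (0 to 2) of SNP i for individual j.
    Monomorphic SNPs are skipped because their variance term is zero.
    """
    n_ind = len(dosages[0])
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    n_used = 0
    for snp in dosages:
        p = sum(snp) / (2.0 * n_ind)   # allele frequency from mean dosage
        var = 2.0 * p * (1.0 - p)      # expected genotype variance
        if var == 0.0:
            continue
        n_used += 1
        centered = [x - 2.0 * p for x in snp]
        for j in range(n_ind):
            for k in range(j + 1):
                grm[j][k] += centered[j] * centered[k] / var
    if n_used == 0:
        raise ValueError("no polymorphic SNPs")
    for j in range(n_ind):
        for k in range(j + 1):
            grm[j][k] /= n_used
            grm[k][j] = grm[j][k]      # fill the upper triangle
    return grm


# Made-up dosages: 3 SNPs x 3 individuals.
example = [
    [0.1, 1.0, 1.9],
    [0.0, 0.4, 1.2],
    [2.0, 1.5, 0.8],
]
for row in grm_from_dosages(example):
    print([round(v, 3) for v in row])
```

The nested loops are fine for a toy example; for real data a matrix library would be the practical choice.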