|
Post by hector on May 3, 2017 13:21:47 GMT
Hello! I am trying to simulate a case-control phenotype, but the results do not match what I would expect. In particular, there are two things I don't understand:

- Distribution of chi-squared p-values in the causal SNPs (genotype 0, 1 or 2 vs. phenotype): at the very least, I would expect a uniform distribution. The roughly normal distribution centered around 0.5 that I get is the opposite of what I would expect. Most of the SNPs should show an association with the phenotype, in one direction or the other.
- There is also a lack of correlation between the effect sizes and the p-values.

All in all, it seems to me there is something I am missing. Here is the command I am using:

gcta64 --bfile genotypes --simu-cc 1283 1283 --simu-causal-loci causal_snps.txt --simu-hsq 1 --simu-k 0.5 --simu-rep 20 --out simu

What I think is the problem (and, if possible, I would like some further explanation) is here:

- --simu-cc 1283 1283: I have 2566 samples, and I want half cases and half controls.
- --simu-k 0.5: the prevalence. I do not understand why it is necessary, as the prevalence of the phenotype should not matter in this experiment, where the proportions of cases and controls are not representative of the population. However, I need to specify 0.5 to make sure I can have the same number of cases and controls.
Can you point out if I am doing something incorrectly? Thank you in advance!
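For reference, the per-SNP check described above can be sketched as follows. This is a minimal sketch in plain Python, assuming genotypes coded 0/1/2 and phenotypes coded 0 (control) or 1 (case); the function name is hypothetical and it is not necessarily the exact test behind the numbers reported in this post.

```python
import math

def chi2_genotype_pvalue(genotypes, phenotypes):
    """Pearson chi-squared test of genotype (0/1/2) vs. case/control status (0/1)."""
    # Observed counts: rows are genotype classes, columns are control (0) and case (1).
    observed = [[0, 0] for _ in range(3)]
    for g, y in zip(genotypes, phenotypes):
        observed[g][y] += 1

    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(row[j] for row in observed) for j in range(2)]

    # Ignore empty rows/columns; they carry no information and reduce the df.
    rows = [i for i in range(3) if row_totals[i] > 0]
    cols = [j for j in range(2) if col_totals[j] > 0]
    df = (len(rows) - 1) * (len(cols) - 1)
    if df == 0:
        return 1.0

    chi2 = 0.0
    for i in rows:
        for j in cols:
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Closed-form chi-squared survival functions for the only possible df values.
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    return math.exp(-chi2 / 2.0)  # df == 2
```

Under the null hypothesis (no association) these p-values should be roughly uniform on [0, 1], and for truly causal SNPs they should pile up near 0.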
|
|
|
Post by summaira on Jun 7, 2017 13:31:46 GMT
Hey, I want to ask about the format of the genotype file. I am also trying to run the same simulation and am very confused about the input file.
|
|
|
Post by Zhihong Zhu on Jun 24, 2017 5:53:35 GMT
Hi there,

As described on the GCTA website:

# Simulate 500 cases and 500 controls with the heritability of liability of 0.5 and disease prevalence of 0.1 for 3 times
gcta64 --bfile test --simu-cc 500 500 --simu-causal-loci causal.snplist --simu-hsq 0.5 --simu-k 0.1 --simu-rep 3 --out test

Can you please let me know which file is not clear to you?

Cheers,
Zhihong
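As a rough illustration of the two inputs in that command: --bfile points to a PLINK binary fileset (test.bed, test.bim, test.fam), and --simu-causal-loci points to a plain-text list of causal SNP IDs. The sketch below assumes the list holds one SNP ID per line and draws the IDs at random from the .bim file; the helper names are hypothetical, so please check the layout against the GCTA documentation before relying on it.

```python
import random

def read_bim_snp_ids(bim_lines):
    """Extract SNP IDs from PLINK .bim lines.

    Each .bim line has six whitespace-separated fields:
    chromosome, SNP ID, genetic distance, bp position, allele 1, allele 2.
    """
    ids = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) >= 2:
            ids.append(fields[1])
    return ids

def pick_causal_snps(bim_lines, n_causal, seed=0):
    """Randomly choose n_causal distinct SNP IDs to use as causal variants."""
    ids = read_bim_snp_ids(bim_lines)
    if n_causal > len(ids):
        raise ValueError("more causal SNPs requested than SNPs in the .bim file")
    return random.Random(seed).sample(ids, n_causal)

def format_snplist(snp_ids):
    """Render the list as text, one SNP ID per line (assumed file layout)."""
    return "".join(snp_id + "\n" for snp_id in snp_ids)

# Usage (file names taken from the example command above):
#   with open("test.bim") as bim:
#       causal = pick_causal_snps(bim, n_causal=100, seed=1)
#   with open("causal.snplist", "w") as out:
#       out.write(format_snplist(causal))
```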
|
|
|
Post by Zhihong Zhu on Jun 24, 2017 5:57:28 GMT
Hi there,

--simu-k gives GCTA the disease prevalence in the population, which is different from the proportion of cases in the sample (set by --simu-cc). For example, with k = 0.1 and n = 10,000 individuals, the maximum number of cases is 1,000. --simu-cc then provides the number of cases and controls in the sample.

Cheers,
Zhihong
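The distinction between population prevalence and sample composition can be sketched with a toy liability-threshold simulation. This is only an illustration under stated assumptions, not GCTA's actual implementation: liabilities are drawn directly from a standard normal (rather than built from genotypes and a residual), and the function name and parameters are hypothetical.

```python
import random
from statistics import NormalDist

def simulate_case_control(n_pop, prevalence, n_cases, n_controls, seed=0):
    """Toy liability-threshold model: affected if liability exceeds the threshold."""
    rng = random.Random(seed)
    # Threshold chosen so that a fraction `prevalence` of the population is affected.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    liabilities = [rng.gauss(0.0, 1.0) for _ in range(n_pop)]

    cases = [i for i, liab in enumerate(liabilities) if liab > threshold]
    controls = [i for i, liab in enumerate(liabilities) if liab <= threshold]

    # The prevalence caps how many cases exist; the sample sizes are picked afterwards.
    if n_cases > len(cases) or n_controls > len(controls):
        raise ValueError(
            f"only {len(cases)} cases and {len(controls)} controls available"
        )
    return rng.sample(cases, n_cases), rng.sample(controls, n_controls)
```

With n_pop = 10000 and prevalence = 0.1, only about 1,000 individuals end up above the threshold, so asking for 500 cases and 500 controls works, while asking for 5,000 cases raises an error.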
|
|
|
Post by summaira on Oct 20, 2017 9:40:55 GMT
I would like to know whether it is possible to output simulated genotypes in a GCTA simulation, or whether it can only output phenotypes.
|
|