### Post by user987 on Jun 4, 2019 2:10:20 GMT

I'm using R to simulate GWAS data, and then using the mixed.solve() function from the "rrBLUP" package to find REML estimates of the variance components and heritability. This is the mixed.solve() function: www.rdocumentation.org/packages/rrBLUP/versions/4.6/topics/mixed.solve

But then I noticed that GCTA has standard errors for the REML estimates of the h^2 and the variance components, and I'd like to get the standard errors to compute confidence intervals. So I'm trying to convert my genotype matrix in R to GRM to input the GRM in GCTA. Is this the correct way to do it?

If Z, an nxp matrix, is a scaled genotype matrix where n is the number of individuals and p is the number of SNPs, then the GRM is grm_data = ZZ' / p.

Then, use the write_GRMBin() function from the "OmicKriging" package to write a binary GRM format recently introduced by GCTA. This is the function: www.rdocumentation.org/packages/OmicKriging/versions/1.4.0/topics/write_GRMBin .

In particular, do

1.) In R, do:

write_GRMBin(grm_data, n.snps=p, prefix="grm_data")

write.table(cbind(1:n, 1:n), file = 'grm_data.grm.id' , quote=F, row.names=F, col.names=F)

2.) Then, in GCTA, do this to get the REML estimates of the variance components, h^2, and their standard errors:

/gcta_1.91.6beta/gcta64 --grm grm_data --pheno pheno_data --reml --reml-est-fix --out gcta_reml_est

P.S., this is the write_GRMBin() function:

write_GRMBin <- function(X, n.snps = 0.0, prefix, size = 4) {

sum_i <- function(i){

return(sum(1:i))

}

## file connections

NFileName <- paste(prefix,".grm.N.bin",sep="")

IDFileName <- paste(prefix,".grm.id",sep="")

BinFileName <- paste(prefix,".grm.bin",sep="")

## pull sample ids and dimension of GRM

id <- rownames(X)

n <- length(id)

## pull diagonal elements

diag.elem <- diag(X)

## pull upper triangular elements (column major order)

off.diag.elem <- X[upper.tri(X, diag=FALSE)]

## collapse GRM into vector

i <- sapply(1:n, sum_i)

collapsed.grm <- vector(mode="numeric", length = n*(n+1)/2)

collapsed.grm

But then I noticed that GCTA has standard errors for the REML estimates of the h^2 and the variance components, and I'd like to get the standard errors to compute confidence intervals. So I'm trying to convert my genotype matrix in R to GRM to input the GRM in GCTA. Is this the correct way to do it?

If Z, an nxp matrix, is a scaled genotype matrix where n is the number of individuals and p is the number of SNPs, then the GRM is grm_data = ZZ' / p.

Then, use the write_GRMBin() function from the "OmicKriging" package to write a binary GRM format recently introduced by GCTA. This is the function: www.rdocumentation.org/packages/OmicKriging/versions/1.4.0/topics/write_GRMBin .

In particular, do

1.) In R, do:

write_GRMBin(grm_data, n.snps=p, prefix="grm_data")

write.table(cbind(1:n, 1:n), file = 'grm_data.grm.id' , quote=F, row.names=F, col.names=F)

2.) Then, in GCTA, do this to get the REML estimates of the variance components, h^2, and their standard errors:

/gcta_1.91.6beta/gcta64 --grm grm_data --pheno pheno_data --reml --reml-est-fix --out gcta_reml_est

**Is this the correct way to convert the genotype matrix in R to a usable format for GCTA to get REML estimates and their standard errors?****Also, is the confidence interval the REML estimate +/- Z_{\alpha/2} * standard error, where Z_{\alpha/2} is the z score for percentile?****----**P.S., this is the write_GRMBin() function:

write_GRMBin <- function(X, n.snps = 0.0, prefix, size = 4) {

sum_i <- function(i){

return(sum(1:i))

}

## file connections

NFileName <- paste(prefix,".grm.N.bin",sep="")

IDFileName <- paste(prefix,".grm.id",sep="")

BinFileName <- paste(prefix,".grm.bin",sep="")

## pull sample ids and dimension of GRM

id <- rownames(X)

n <- length(id)

## pull diagonal elements

diag.elem <- diag(X)

## pull upper triangular elements (column major order)

off.diag.elem <- X[upper.tri(X, diag=FALSE)]

## collapse GRM into vector

i <- sapply(1:n, sum_i)

collapsed.grm <- vector(mode="numeric", length = n*(n+1)/2)

collapsed.grm

*<- diag.elem*

collapsed.grm[-i] <- off.diag.elem

## write binary files

BinFile <- file(BinFileName, "wb")

NFile <- file(NFileName, "wb")

writeBin(con = BinFile, collapsed.grm, size = size)

writeBin(con = NFile, rep(n.snps, n*(n+1)/2), size = size )

close(BinFile)

close(NFile)

## write sample ID file -- we are dropping sample family IDs here

write.table(cbind(id, id), file = IDFileName, col.names=FALSE,

row.names=FALSE, quote=FALSE)

}collapsed.grm[-i] <- off.diag.elem

## write binary files

BinFile <- file(BinFileName, "wb")

NFile <- file(NFileName, "wb")

writeBin(con = BinFile, collapsed.grm, size = size)

writeBin(con = NFile, rep(n.snps, n*(n+1)/2), size = size )

close(BinFile)

close(NFile)

## write sample ID file -- we are dropping sample family IDs here

write.table(cbind(id, id), file = IDFileName, col.names=FALSE,

row.names=FALSE, quote=FALSE)

}