
Post by Jian Yang on Dec 15, 2016 0:17:18 GMT
Principal component analysis
--pca 20
Input the GRM and output the first n (n = 20, by default) eigenvectors (saved as *.eigenvec, plain text file) and all the eigenvalues (saved as *.eigenval, plain text file), which are equivalent to those calculated by the program EIGENSTRAT. The only purpose of this option is to calculate the first n eigenvectors so that they can subsequently be included as covariates in the model when estimating the variance explained by all the SNPs (see below for the option of estimating the variance explained by genome-wide SNPs). Please use the EIGENSTRAT software if you need a more sophisticated principal component analysis of the population structure.
Output file format
test.eigenval (no header line; the first n eigenvalues)
20.436
7.1293
6.7267
......
test.eigenvec (no header line; the first n eigenvectors; columns are family ID, individual ID and the first n eigenvectors)
011 0101 0.00466824 0.000947 0.00467529 0.00923534
012 0102 0.00139304 0.00686406 0.0129945 0.00681755
013 0103 0.00457615 0.00287646 0.00420995 0.0169046
......
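The eigenvec layout described above (family ID, individual ID, then the eigenvectors) can be sanity-checked before the file is used as a covariate file. The following is only a minimal sketch: the file name sample.eigenvec is made up for illustration, and its three rows are the sample rows from this post, not real output.

```shell
#!/bin/sh
# Sample rows copied from the post; a real *.eigenvec file has one row per individual.
# sample.eigenvec is a hypothetical file name used only for this check.
cat > sample.eigenvec <<'EOF'
011 0101 0.00466824 0.000947 0.00467529 0.00923534
012 0102 0.00139304 0.00686406 0.0129945 0.00681755
013 0103 0.00457615 0.00287646 0.00420995 0.0169046
EOF

# Count the eigenvector columns (total fields minus family ID and individual ID)
# and report any row whose field count differs from the first row.
summary=$(awk '
    NR == 1 { n = NF }
    NF != n { printf "line %d has %d fields, expected %d\n", NR, NF, n; bad = 1 }
    END     { if (!bad) printf "%d individuals, %d eigenvectors\n", NR, n - 2; exit bad }
' sample.eigenvec)
echo "$summary"
```

A row with a missing or extra value is reported with its line number, which is usually enough to spot a truncated or hand-edited file.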
Examples
# Input the GRM file and output the first 20 eigenvectors for a subset of individuals
gcta64 --grm test --keep test.indi.list --pca 20 --out test
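Since the stated purpose of the eigenvectors is to include them as covariates when estimating the variance explained by the SNPs, a follow-up REML run might look roughly like the sketch below. The phenotype file test.phen and the output prefix test_reml are assumed names for illustration. Because *.eigenvec has the same layout as a quantitative covariate file (family ID, individual ID, then values), it is passed here directly to --qcovar.

```shell
#!/bin/sh
# Assumed names: test.phen is a hypothetical phenotype file;
# test.eigenvec is the eigenvector file written by the PCA step.
grm=test
pheno=test.phen
pcs=test.eigenvec

# Build the REML command with the eigenvectors as quantitative covariates.
cmd="gcta64 --reml --grm $grm --pheno $pheno --qcovar $pcs --out test_reml"
echo "$cmd"

# Run it only when gcta64 and the input files are actually present.
if command -v gcta64 >/dev/null 2>&1 && [ -f "$pheno" ] && [ -f "$pcs" ]; then
    $cmd
fi
```

Printing the command first makes it easy to check the file names before anything is run.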

