Reputation: 21
I performed a GWAS in PLINK and now I would like to look at the data for a small set of SNPs, listed one per line in a file called snps.txt.
I would like to export the data from PLINK for these specific SNPs into a .txt or .csv file. Ideally, this file would have the individual IDs as well as the genotypes for these SNPs so that I could later merge it with my phenotype file and perform additional analyses and plots.
Is there an easy way to do that? I know I can use --extract to request specific SNPs only, but I can't find a way to tell PLINK to export the data to an "exportable" text-based format.
Upvotes: 2
Views: 7076
Reputation: 143
If you are using classic plink (1.07) you should consider upgrading to plink 1.9. It is a lot faster, and supports many more formats. This answer is for plink 1.9.
It sounds like your problem is that you are unable to turn the binary data into a regular plink text file.
This is easy to do with the --recode option. Used without any parameters, it converts the data to the plink text format (a .ped/.map pair):
plink --bfile gwas_file --recode --extract snps.txt --out gwas_file_text
If you want to convert the .ped data to a csv afterwards you could do the following:
cut -d " " -f2,7- --output-delimiter=, gwas_file_text.ped
This prints comma-delimited output with the individual IDs in the first column, followed by the genotypes (two allele columns per SNP). The --output-delimiter option requires GNU cut.
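The output above has no header row, which makes a later merge with a phenotype file awkward. As a rough sketch of one way to add one, the following builds toy stand-ins for the .ped and .map files and assembles the CSV with awk. The toy.* file names, the rs IDs, and the _a1/_a2 column suffixes are made up for illustration; with real data you would point the two awk commands at gwas_file_text.map and gwas_file_text.ped instead.

```shell
# Toy stand-in for gwas_file_text.ped:
# FID IID PAT MAT SEX PHENO, then two allele columns per SNP
cat > toy.ped <<'EOF'
fam1 ind1 0 0 1 2 A A G T
fam2 ind2 0 0 2 1 A C T T
EOF

# Toy stand-in for gwas_file_text.map:
# chromosome, SNP ID, genetic distance, base-pair position
cat > toy.map <<'EOF'
1 rs111 0 1000
1 rs222 0 2000
EOF

# Header row: IID, then two columns per SNP (one per allele),
# taking the SNP IDs from the second field of the .map file
awk 'BEGIN { printf "IID" } { printf ",%s_a1,%s_a2", $2, $2 } END { print "" }' toy.map > toy.csv

# Data rows: IID (field 2) followed by all genotype fields (7 onward)
awk '{ out = $2; for (i = 7; i <= NF; i++) out = out "," $i; print out }' toy.ped >> toy.csv

cat toy.csv
```

Because awk splits on any whitespace by default, this works whether the files are space- or tab-delimited.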
Note that you can also convert the data to a lot of other text-based filetypes, all described in the docs.
One of these is the variant call format (VCF), which puts the SNPs and the individual IDs together in a single file, as requested:
plink --bfile gwas_file --recode vcf --extract snps.txt --out gwas_file_text
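If you go the VCF route, the file still has one row per SNP rather than one row per individual. A minimal sketch of flattening it into a long-format CSV (one row per individual and SNP) is shown below on a toy VCF. The toy.vcf and toy_long.csv names, the sample IDs, and the output column names are assumptions for illustration, not something plink produces under those names.

```shell
# Toy stand-in for gwas_file_text.vcf (tab-delimited; IDs are made up)
printf '##fileformat=VCFv4.2\n' > toy.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tfam1_ind1\tfam2_ind2\n' >> toy.vcf
printf '1\t1000\trs111\tA\tC\t.\t.\t.\tGT\t0/0\t0/1\n' >> toy.vcf
printf '1\t2000\trs222\tG\tT\t.\t.\t.\tGT\t0/1\t1/1\n' >> toy.vcf

awk -F'\t' '
  /^##/ { next }                       # skip meta-information lines
  /^#CHROM/ {                          # header line: remember sample IDs
    for (i = 10; i <= NF; i++) id[i] = $i
    print "sample,snp,genotype"
    next
  }
  {                                    # data line: one output row per sample
    for (i = 10; i <= NF; i++) print id[i] "," $3 "," $i
  }
' toy.vcf > toy_long.csv

cat toy_long.csv
```

Sample columns start at field 10 in a VCF, and field 3 holds the SNP ID, so the long table can be joined to a phenotype file on the sample column.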
Upvotes: 5