Ramendra Sarma

Reputation: 83

Putting population information into a VCF file

I have a VCF file without population information. I have three text files (pop1.txt, pop2.txt and pop3.txt) containing the names of the samples. How do I add population information to that VCF file, in R or another way?

Upvotes: 1

Views: 129

Answers (1)

dthorbur

Reputation: 1095

There are a few ways you can do this.

  1. Upstream, you could have named the samples with population notations included. For example, I named the fastq/bam files in an experiment No_L_1 and No_R_1 for sample number 1 in the Norwegian lake and stream dataset I had.

  2. Use something like sed or awk to loop through the population IDs and change the VCF sample column names to something more intuitive.

  3. In R, read the data in using a library like vcfR, and then change the sample names on the R object. I tend to read the data in and convert the vcfR object into a data.table (i.e., vcf <- read.vcfR(vcf_path) %>% as.data.table()).
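As a rough illustration of option 2, here is a minimal Python sketch of the same header rewrite that sed or awk would do. It assumes an uncompressed, tab-delimited VCF whose sample names start at column 10 of the #CHROM line; the function name rename_vcf_samples, the "_" separator, and the sample names S1 and S2 are made up for the example and are not from the question.

```python
def rename_vcf_samples(lines, sample_to_pop, sep="_"):
    """Yield VCF lines, renaming each sample column to '<population><sep><sample>'.

    `sample_to_pop` is a dict mapping sample name -> population label
    (hypothetical input; build it from your own population files).
    """
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # The first 9 columns (CHROM..FORMAT) are fixed; samples follow.
            fixed, samples = fields[:9], fields[9:]
            renamed = [
                f"{sample_to_pop[s]}{sep}{s}" if s in sample_to_pop else s
                for s in samples
            ]
            yield "\t".join(fixed + renamed) + "\n"
        else:
            # Meta lines (##...) and variant records pass through unchanged.
            yield line


# Tiny in-memory example with invented sample names.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n",
]
out = list(rename_vcf_samples(demo, {"S1": "pop1", "S2": "pop2"}))
print(out[1], end="")
```

For a real file you would pass an open file handle instead of the demo list and write the yielded lines to a new VCF. Samples missing from the mapping keep their original names, so it is worth checking the new header before running anything downstream.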

Regardless, it would probably be easiest to have all your population data in a single csv if you're doing analyses in R, with column 1 being sample_id, and column 2 being population.
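A minimal sketch of building that csv from per-population sample lists, written in Python. It assumes each population file holds one sample name per line and that the file name without its extension (pop1, pop2, ...) is the population label; the helper names, the output name populations.csv, and the sample names are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def build_population_table(pop_files):
    """Return (sample_id, population) pairs, one per non-empty line in each file."""
    rows = []
    for path in pop_files:
        path = Path(path)
        population = path.stem  # "pop1.txt" -> "pop1" (assumed naming scheme)
        for line in path.read_text().splitlines():
            sample_id = line.strip()
            if sample_id:  # skip blank lines
                rows.append((sample_id, population))
    return rows


def write_population_csv(rows, out_path):
    """Write the pairs as a two-column csv with a header row."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample_id", "population"])
        writer.writerows(rows)


# Demo on throwaway files with invented sample names.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "pop1.txt").write_text("S1\nS2\n")
    (tmp / "pop2.txt").write_text("S3\n")
    rows = build_population_table([tmp / "pop1.txt", tmp / "pop2.txt"])
    write_population_csv(rows, tmp / "populations.csv")
    csv_text = (tmp / "populations.csv").read_text()

print(csv_text)
```

The resulting file can then be read into R and joined to the sample names from the VCF on sample_id.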

Upvotes: 0
