Dara Vakili

Reputation: 1

Using R, extract all rows that contain a string stored in a variable

I have a metadata file stored as a .tsv which I read into R and save as META. I need to extract all rows containing a given string "male", here stored in variable sample.

The full script has a lot of these operations and so it's important that I store the pattern in sample below. The errors are in the way I am trying to grep.

IN <- "/home/zchadva/Scratch/output/cov"

#metadata
META <- read.table("/home/zchadva/Scratch/data/hipsci/rnaseq/hipsci.qc1_sample_info.20160926.tsv", header = TRUE, sep = "\t")

#Set study/table variables
sample <- "\\<male\\>"
control <- "female"

#Grep all rows containing "male" from the table META
sample.list <- META[grep(sample, META, value=TRUE)]

EDIT: This has got me closer

Ideally I do not want to use META$Gender to specify a column each time I need to do a pattern search, as our real metadata file is humongous. If I do need to specify a column, I would like to keep Gender in a variable.

sample.list <- META[grep(sample, META$Gender), ]

For example:

column <- "Gender"
sample.list <- META[grepl(sample, META$column), ]

#Table example simplified
ID    Disease    Gender    Cell
JX1   ibd        male      liver
PTY   healthy    male      liver
HB3   ibd        female    brain
PO3   bbs        male      

#Desired layout in sample.list
JX1   ibd        male      liver
PTY   healthy    male      liver
PO3   bbs        male      

Any Help Greatly Appreciated. I have tried to do this for hours

Upvotes: 0

Views: 2354

Answers (1)

Benjamin

Reputation: 17369

grepl is a better fit than grep here: it returns a logical vector with one element per row, which you can use directly to index your data frame.

META <- 
  data.frame(ID = c("JX1", "PTY", "HB3", "PO3"),
             Disease = c("ibd", "healthy", "ibd", "bbs"),
             Gender = c("male", "male", "female", "male"),
             Cell = c("liver", "liver", "brain", "liver"))

sample <- "male"
control <- "female"

# "^male" anchors the match at the start of the string, so "female" is excluded
META[grepl("^male", META$Gender), ]

   ID Disease Gender  Cell
1 JX1     ibd   male liver
2 PTY healthy   male liver
4 PO3     bbs   male liver
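
To keep both the pattern and the column name in variables, as the question asks, one possible sketch is below. It assumes the simplified table from the question; the variable name column and the result name sample.list.any are illustrative, not part of the original script. The [[ ]] operator accepts a character variable, whereas $ takes the name literally. The second option searches every column, so no column has to be named at all.

```r
# Toy data mirroring the simplified table in the question
META <- data.frame(
  ID      = c("JX1", "PTY", "HB3", "PO3"),
  Disease = c("ibd", "healthy", "ibd", "bbs"),
  Gender  = c("male", "male", "female", "male"),
  Cell    = c("liver", "liver", "brain", ""),
  stringsAsFactors = FALSE
)

sample <- "\\<male\\>"  # word-boundary pattern from the question
column <- "Gender"      # illustrative variable holding the column name

# Option 1: index the column through the variable with [[ ]]
sample.list <- META[grepl(sample, META[[column]]), ]

# Option 2: test every column and keep rows with a match in any of them
hits <- Reduce(`|`, lapply(META, function(col) grepl(sample, col)))
sample.list.any <- META[hits, ]
```

Option 2 may be slow on a very large table because it runs the pattern against every column, so naming the column through a variable is probably the cheaper choice when the column is known.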

Upvotes: 1
