marie

Reputation: 223

subsetting multiple datapoints from dataset

I am trying to extract a subset from a huge dataset. The following code works for extracting a single datapoint from the dataset.

   write.csv(subset(project, grepl("^UN1705.* ", Trial_group) ), file="kiki.csv")

How can I tell R that I want to extract multiple datapoints? I tried commas, semicolons, etc., but nothing worked:

  write.csv(subset(project, grepl("^UN1705, UN1706.* ", Trial_group) ), file="kiki.csv")

Upvotes: 1

Views: 194

Answers (2)

Ari B. Friedman

Reputation: 72741

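To combine logical vectors, use & and |, for AND and OR respectively. Here you want rows that match either pattern, so use | (a value cannot start with both UN1705 and UN1706, so & would never be TRUE):

grepl("^UN1705.* ", Trial_group) | grepl("^UN1706.* ", Trial_group)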
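As a minimal sketch of how the element-wise operators behave, here are the same two patterns applied to a small vector. The values in Trial_group below are made up for illustration and are not from the original dataset.

```r
# Made-up values standing in for the Trial_group column (assumption)
Trial_group <- c("UN1705 a", "UN1706 b", "UN1707 c")

is_1705 <- grepl("^UN1705.* ", Trial_group)
is_1706 <- grepl("^UN1706.* ", Trial_group)

# Element-wise OR: TRUE where either pattern matches
either <- is_1705 | is_1706

# Element-wise AND: TRUE only where both patterns match,
# which cannot happen for two different prefixes
both <- is_1705 & is_1706

print(either)
print(both)
```

The OR result can be passed directly to subset() as the row filter.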

Just for fun, benchmarks!

Trial_group <- sample(letters,10^5,replace=TRUE)
library(microbenchmark)
microbenchmark(
  grepl("^b.*|^c.*", Trial_group) ,
  grepl("^b.*", Trial_group) | grepl("^c.*", Trial_group)
)

Unit: milliseconds
                                                     expr      min       lq   median       uq      max
1                         grepl("^b.*|^c.*", Trial_group) 15.25969 15.73327 15.95457 16.37784 18.89444
2 grepl("^b.*", Trial_group) | grepl("^c.*", Trial_group) 27.39136 28.18150 28.65988 29.47160 49.31859

Looks like doing the logical OR within the regular expression is faster: a median of about 16 ms versus about 29 ms for two separate grepl() calls on this sample.

Upvotes: 1

DrDom

Reputation: 4123

Or you can combine these queries in one regex, using | for alternation:

grepl("^UN1705.* |^UN1706.* ", Trial_group)
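For more than a couple of groups, the alternation could be built from a vector of prefixes rather than typed by hand. This is only a sketch: the project data frame below is a made-up stand-in, and the names prefixes, pattern, kept and out_file are hypothetical.

```r
# Hypothetical stand-in for the real `project` data (assumption)
project <- data.frame(
  Trial_group = c("UN1705 x", "UN1706 y", "UN1707 z", "AB1705 w"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Prefixes to keep; extend this vector to match more groups
prefixes <- c("UN1705", "UN1706")

# Builds the single pattern "^UN1705.* |^UN1706.* "
pattern <- paste0("^", prefixes, ".* ", collapse = "|")

# Keep only rows whose Trial_group matches one of the prefixes
kept <- subset(project, grepl(pattern, Trial_group))

# Written to a temporary file here; swap in "kiki.csv" for real use
out_file <- tempfile(fileext = ".csv")
write.csv(kept, file = out_file)
```

The trailing space in the pattern is kept from the question, so a value only matches if a space appears somewhere after the prefix.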

Upvotes: 2
