Harry Caulton

Reputation: 31

How to subset using partially matching strings?

I'm trying to isolate participants who are categorised as either NJS or ELJ followed by a number, e.g. NJS1, NJS2, ELJ8, ELJ25. I remember there being a symbol that means "the cell contains "XYZ" followed by anything", which would let me separate my participants into the two groups. I have tried the following, to no avail.

NJSBio = subset(biography, biography$`L1(s)` == "NJS#")
//
NJSBio = subset(biography, biography$`L1(s)` == "NJS?")
//
NJSBio = subset(biography, biography$`L1(s)` == "NJS*")

I have tried to find the answer using the "Help" function in RStudio and using Google, but I'm guessing that my search terms are too vague. Could anyone help refresh my memory?

Upvotes: 2

Views: 48

Answers (1)

iago

Reputation: 3256

If the column is called L1(s), you may try:

library(dplyr)
NJSBio = filter(biography, grepl("NJS.*", `L1(s)`))

Or adapting the last of your options

NJSBio = subset(biography, grepl("NJS.*", biography$`L1(s)`))

should also work.
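
As a minimal sketch of why the == attempts fail and how the pattern match behaves: == compares whole strings literally, so "NJS*" only matches a cell containing exactly NJS*, whereas grepl() treats its pattern as a regular expression. The toy data frame below is invented for illustration (only the column name L1(s) comes from the question), and anchoring the pattern with ^ is my suggestion rather than part of the code above.

```r
# Toy data, invented for illustration; check.names = FALSE keeps the "L1(s)" name
biography <- data.frame(
  id = 1:5,
  `L1(s)` = c("NJS1", "NJS2", "ELJ8", "ELJ25", "XNJS3"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# == is a literal comparison, so no row equals the string "NJS*"
literal <- subset(biography, `L1(s)` == "NJS*")

# "^NJS" matches cells that begin with NJS
# (the unanchored "NJS.*" would also match "XNJS3")
NJSBio <- subset(biography, grepl("^NJS", `L1(s)`))
ELJBio <- subset(biography, grepl("^ELJ", `L1(s)`))

# Stricter: either prefix followed by one or more digits and nothing else
both <- subset(biography, grepl("^(NJS|ELJ)[0-9]+$", `L1(s)`))

# startsWith() is a base R alternative that needs no regular expression
NJSBio2 <- biography[startsWith(biography$`L1(s)`, "NJS"), ]
```

If the codes can only ever appear at the start of the cell, the anchored and unanchored patterns select the same rows, so the anchor mainly guards against unexpected values.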

But to avoid problems, and as a more general comment, it is better to avoid the use of parentheses in variable names.
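
One way to do that, sketched on made-up data, is to rename the column once after loading so that backticks are no longer needed. The new name L1s is a hypothetical choice, not something from the question.

```r
# Toy data, invented for illustration
biography <- data.frame(
  `L1(s)` = c("NJS1", "ELJ8"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Rename the one awkward column by hand ("L1s" is a hypothetical new name)
names(biography)[names(biography) == "L1(s)"] <- "L1s"

# The column can now be referred to without backticks
NJSBio <- subset(biography, grepl("^NJS", L1s))
```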

Upvotes: 1
