Logan McDonald

Reputation: 85

grepl help in R

I have a large dataset with a variable that contains names. Because of how the data was entered, I only need the names that consist of a single word. I was thinking of using grepl to find any names that contain a blank space, but I would also need to do this for "-". In the end I need only the observations with one word in this variable. So far

more_than_one_word <- mydata[grepl("\s", mydata$City) , ]

doesn't pick up anything like "Sussie James." What else can I do? Thanks.

Upvotes: 2

Views: 354

Answers (2)

RHertel

Reputation: 23818

You could try

only_one_word <- mydata[which(!grepl(" |-", mydata$City)), ] 

Example:

cities <- c("Los Angeles", "New York", "Chicago", "Aix-en-Provence")
cities[which(!grepl(" |-", cities))]
#> [1] "Chicago"

That's if you need to remove any entry with a hyphen, too. If hyphenated names should count as one word, match only the space:

cities[which(!grepl(" ", cities))]
#> [1] "Chicago"         "Aix-en-Provence"
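Applied to a data frame, the same idea might look like the sketch below. The `mydata` built here is a made-up stand-in for the asker's data, and the character class `[[:space:]-]` is a swapped-in alternative to `" |-"` that also catches tabs and other whitespace. As for the original attempt, a backslash has to be doubled inside an R string literal, so the whitespace shorthand would be written `"\\s"` rather than `"\s"`.

```r
# Hypothetical stand-in for the asker's data; only the City column matters here.
mydata <- data.frame(
  City = c("Los Angeles", "New York", "Chicago", "Aix-en-Provence", "Sussie James"),
  stringsAsFactors = FALSE
)

# [[:space:]-] matches any whitespace character or a literal hyphen.
has_separator <- grepl("[[:space:]-]", mydata$City)

# Split the rows on the logical vector; drop = FALSE keeps a data frame
# even though there is only one column.
only_one_word      <- mydata[!has_separator, , drop = FALSE]
more_than_one_word <- mydata[has_separator, , drop = FALSE]

only_one_word$City
```

Indexing with the logical vector directly makes the `which()` call optional; both forms select the same rows here.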

Hope this helps.

Upvotes: 2

Señor O

Reputation: 17432

I would take the approach of saying "Give me any string that's just letters!"

> vec = c(" ", "hi", "Chicago", "new york", "New_York")
> vec
[1] " "        "hi"       "Chicago"  "new york" "New_York"
> grep("^[a-zA-Z]*$", vec)
[1] 2 3

This will accept any string that is just letters from the first character to the last. grep returns the positions of the matching elements, here "hi" and "Chicago".
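One detail worth noting: `*` also allows zero letters, so an empty string would pass the pattern above. A small variation, offered as a sketch and not as the answerer's own code, uses `+` to require at least one letter and `grepl` to get a logical vector that can be used for subsetting. The extra `""` entry in `vec` is added here purely for illustration.

```r
# Same example vector as above, plus an empty string added for illustration.
vec <- c("", " ", "hi", "Chicago", "new york", "New_York")

# "+" requires one or more letters, so "" is rejected along with
# anything containing a space, underscore, or hyphen.
letters_only <- grepl("^[a-zA-Z]+$", vec)

vec[letters_only]
```

With a data frame, the same logical vector could be used as a row index, for example `mydata[grepl("^[a-zA-Z]+$", mydata$City), ]`, assuming the column is named `City` as in the question.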

Upvotes: 2
