giac

Reputation: 4309

r - grep OR after sequence of digits

I have a vector v whose elements each contain a sequence of digits followed by either day or week. I would like to select only the elements where the digits are followed by day.

v = c('abc_1day', 'abc_2day', 'abc_3day', 'abc_1week', 'abc_2dweek')

I thought the OR condition would work here:

v[grep('abc_|day', v)] 

Why doesn't it?

Upvotes: 3

Views: 1432

Answers (3)

Andrew J. Rech

Reputation: 466

The OR condition matches either abc_ or day. Every element of v contains abc_, so grep returns all of them.
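To see why the alternation keeps everything, here is a small check using the vector v from the question. The variable names matches_either and matches_sequence are only illustrative:

```r
v <- c('abc_1day', 'abc_2day', 'abc_3day', 'abc_1week', 'abc_2dweek')

# 'abc_|day' means "contains abc_" OR "contains day".
# Every element contains abc_, so every element matches.
matches_either <- grepl('abc_|day', v)
matches_either
# [1] TRUE TRUE TRUE TRUE TRUE

# Requiring the parts in sequence (prefix, digits, day) drops the week entries.
matches_sequence <- grepl('abc_[0-9]+day', v)
matches_sequence
# [1]  TRUE  TRUE  TRUE FALSE FALSE
```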

One option is to use \K, which drops everything matched before it from the reported match, so only day is matched, and only when it is preceded by abc_ and the digits:

v[grep('abc_[0-9]+\\Kday', v, perl = TRUE)]
[1] "abc_1day" "abc_2day" "abc_3day"

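This differs from akrun's grep('^abc_[0-9]+day$', v, value = TRUE), where the match spans the whole string; both select the same elements here. A useful advantage of \K over a lookbehind is that the part before \K can be variable length, whereas PCRE lookbehinds generally must be fixed length.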
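The difference only shows up when you extract the matched text rather than the matching elements. A minimal sketch, assuming the vector v from the question and using regmatches with regexpr (the names kept, whole and same_elements are illustrative):

```r
v <- c('abc_1day', 'abc_2day', 'abc_3day', 'abc_1week', 'abc_2dweek')

# With \K, the reported match starts after the prefix, so only "day" is returned.
kept <- regmatches(v, regexpr('abc_[0-9]+\\Kday', v, perl = TRUE))
kept
# [1] "day" "day" "day"

# With the anchored pattern, the reported match is the whole string.
whole <- regmatches(v, regexpr('^abc_[0-9]+day$', v))
whole
# [1] "abc_1day" "abc_2day" "abc_3day"

# As a filter, both patterns pick the same elements of v.
same_elements <- identical(
  grep('abc_[0-9]+\\Kday', v, perl = TRUE),
  grep('^abc_[0-9]+day$', v)
)
same_elements
# [1] TRUE
```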

Upvotes: 1

Tim Biegeleisen

Reputation: 522762

Using grepl:

v[grepl("day", v)]

This assumes that the token day alone is enough to identify the entries you want. If not, you can tighten the regex. To also require a number before day, you can use:

v[grepl("\\d+day", v)]
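On the vector from the question both patterns give the same result. A small sketch of where they diverge, using a hypothetical vector v2 (the element 'abc_today' is invented for illustration and is not in the original data):

```r
# v2 is a made-up example; 'abc_today' contains "day" but no digits before it.
v2 <- c('abc_1day', 'abc_1week', 'abc_today')

# Matching on "day" alone also keeps 'abc_today'.
loose <- v2[grepl("day", v2)]
loose
# [1] "abc_1day"  "abc_today"

# Requiring at least one digit immediately before "day" excludes it.
strict <- v2[grepl("\\d+day", v2)]
strict
# [1] "abc_1day"
```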

Upvotes: 2

akrun

Reputation: 887971

We can use

grep('^abc_[0-9]+day$', v, value = TRUE)
#[1] "abc_1day" "abc_2day" "abc_3day"

NOTE: This follows the OP's criteria: the string starts with 'abc_', continues with one or more digits, and ends with day.
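The anchors ^ and $ are what make this stricter than an unanchored pattern. A brief sketch with a hypothetical vector v3 (the elements 'abc_12days' and 'xabc_3day' are invented to show the effect):

```r
# v3 is a made-up example; only the first element fits the OP's format exactly.
v3 <- c('abc_1day', 'abc_12days', 'xabc_3day')

# Without anchors, the pattern may match anywhere inside the string.
unanchored <- grep('abc_[0-9]+day', v3, value = TRUE)
unanchored
# [1] "abc_1day"   "abc_12days" "xabc_3day"

# With ^ and $, the whole string must be abc_, digits, day and nothing else.
anchored <- grep('^abc_[0-9]+day$', v3, value = TRUE)
anchored
# [1] "abc_1day"
```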

Upvotes: 1
