user3375672

Reputation: 3768

R, get the n'th occurrence of pattern using regex

I have strings like:

s <- "text-32190-3910-text-1671"

I would like to get only the n'th occurrence (e.g. the second) of a pattern, such as a group of digits (a number, "\d+"). Thus, the 2nd occurrence of digits in s would give me "3910". I thought this could be done simply with grep() (and family), but I couldn't find an example on SO.

EDIT: Another case would be:

s2 <- "jklsdKSfdkdlsKLLSDK-kdslkSKKSK"

I would then like to get the third occurrence of a block of capital letters [A-Z]+; in s2 this would be "SKKSK".

Upvotes: 1

Views: 130

Answers (1)

Stedy

Reputation: 7469

The comment by user20650 to use a mix of gregexpr() and regmatches() is a good way to approach this:

R> s <- "text-32190-3910-text-1671"
R> regmatches(s, gregexpr("\\d+", s) )[[1]][2]
[1] "3910"
R> s2 <- "jklsdKSfdkdlsKLLSDK-kdslkSKKSK"
R> regmatches(s2, gregexpr("[A-Z]+", s2) )[[1]][3]
[1] "SKKSK"
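
If you need this in more than one place, the same gregexpr()/regmatches() idea can be wrapped in a small function. This is only a sketch: the name nth_match and its arguments are made up here for illustration and are not part of base R. It assumes you want NA when a string has fewer than n matches.

```r
# Hypothetical helper (not from base R): return the n-th match of `pattern`
# in each element of the character vector `x`, or NA if there are fewer
# than n matches. Extra arguments (e.g. perl = TRUE) are passed to gregexpr().
nth_match <- function(x, pattern, n, ...) {
  # One character vector of matches per element of x
  matches <- regmatches(x, gregexpr(pattern, x, ...))
  vapply(
    matches,
    function(m) if (length(m) >= n) m[n] else NA_character_,
    character(1)
  )
}

s  <- "text-32190-3910-text-1671"
s2 <- "jklsdKSfdkdlsKLLSDK-kdslkSKKSK"

nth_match(s, "\\d+", 2)      # "3910"
nth_match(s2, "[A-Z]+", 3)   # "SKKSK"
nth_match(s, "\\d+", 5)      # NA, since s contains only three numbers
```

Because gregexpr() and regmatches() are vectorised over the input, nth_match(c(s, s2), "[A-Z]+", 1) should work on a whole vector at once, returning NA for strings without a match.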

Upvotes: 4
