Reputation: 456
I would like to extract the two matching groups using R. This is what I have so far, but it returns the whole match instead of the individual groups:
Code:
str = '123abc'
vector <- gregexpr('(?<first>\\d+)(?<second>\\w+)', str, perl=TRUE)
regmatches(str, vector)
Result:
[[1]]
[1] "123abc"
I want the result to be something like this:
[1] "123"
[2] "abc"
Upvotes: 0
Views: 130
Reputation: 270298
Try this:
> library(gsubfn)
> strapplyc("123abc", '(\\d+)(\\w+)')[[1]]
[1] "123" "abc"
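If adding a package is not an option, something similar seems possible in base R. With perl=TRUE, regexpr attaches capture.start, capture.length and capture.names attributes to its result, and substring can cut the groups out using those. A minimal sketch with the pattern from the question (the variable names s, m, starts, lens and groups are mine, not from the question):

```r
s <- "123abc"

# perl = TRUE is needed for named groups and for the capture.* attributes
m <- regexpr("(?<first>\\d+)(?<second>\\w+)", s, perl = TRUE)

# Start position and length of each capture group in the first match
starts <- as.vector(attr(m, "capture.start"))
lens   <- as.vector(attr(m, "capture.length"))

# Cut each group out of the original string and label it with its group name
groups <- substring(s, starts, starts + lens - 1)
names(groups) <- attr(m, "capture.names")
groups
```

This should print a named character vector with "123" under first and "abc" under second. It only looks at the first match in a single string; for several matches per string you would need gregexpr and a loop over its results.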
Upvotes: 0
Reputation: 42689
I've renamed your string s to avoid clobbering str. Here is one approach:
library(stringr)
s <- '123abc'
reg <- '([[:digit:]]+)([[:alpha:]]+)'
complete <- unlist(str_extract_all(s, reg))
partials <- unlist(str_match_all(s, reg))
partials <- partials[!(partials %in% complete)]
partials
[1] "123" "abc"
Upvotes: 1
Reputation: 91
I'm not sure whether you have a specific reason for using regmatches, e.g. because you are importing the expressions in that format. If all your entries share the same well-defined groups, you can extract them this way:
x <- "123abc"
sub("([[:digit:]]+)[[:alpha:]]+","\\1",x)
sub("[[:digit:]]+([[:alpha:]]+)","\\1",x)
Result
[1] "123"
[1] "abc"
I.e., match the entire structure of the string, then replace it with the part you want to retain by enclosing it in round brackets and referring to it with a backreference ("\\1").
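As a possible alternative to calling sub once per group, regexec paired with regmatches returns the full match followed by every capture group in one pass. A sketch, assuming the same digits-then-letters layout (the names pattern, m and groups are only for illustration):

```r
x <- "123abc"
pattern <- "([[:digit:]]+)([[:alpha:]]+)"

# regexec records the position of the full match and of each capture group;
# regmatches turns those positions back into substrings
m <- regmatches(x, regexec(pattern, x))

# Element 1 is the full match ("123abc"); drop it to keep only the groups
groups <- m[[1]][-1]
groups
```

Result

[1] "123" "abc"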
Upvotes: 2
Reputation: 4509
Depending on how well structured your inputs are, you may want to use strsplit to split the string.
See ?strsplit for the documentation.
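For the example string there is no delimiter to split on, so one option is to split at the zero-width boundary between a digit and a letter. This is only a sketch under the assumption that every input is a run of digits followed by ASCII letters; strsplit can behave surprisingly with zero-width patterns, so it is worth checking on your own data:

```r
s <- "123abc"

# Split where a digit is immediately followed by a letter.
# The lookbehind and lookahead consume no characters, so nothing is lost.
parts <- strsplit(s, "(?<=[0-9])(?=[A-Za-z])", perl = TRUE)[[1]]
parts
```

For "123abc" this should give "123" and "abc".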
Upvotes: 0