Reputation: 4314
I've looked through a ton of regex questions similar to mine, but they all seem either very complicated or stop working when I replace the value they are interested in (for instance, a comma) with the value I'm interested in matching (an underscore).
Basically, I want to match only the first underscore in each line of the example below.
As far as I can tell,
_+?
should work, but it doesn't; it still matches every underscore. Likewise,
_{1}
should work, but it also matches every underscore, not just the first one as I expected the quantifier to specify.
Example:
armsling_R_1_Group
armsling_R_1_Rank
armsling_R_2_Group
armsling_R_2_Rank
armsling_R_3_Group
armsling_R_3_Rank
armsling_R_4_Group
armsling_R_4_Rank
armsling_C_1
armsling_F_1
armsling_T_1
armsling_T_2
armsling_T_3
armsling_T_4
Edit: This is for R code, but I've been using regexr.com to test my expressions.
Upvotes: 1
Views: 395
Reputation: 70750
I'm trying to separate these values (which are in one column) into two columns using separate() from tidyr. If I just use an underscore, it splits on the following ones as well.
Based off the comments in the posted answer, the following should work for you.
library(tidyr)
separate(x, y, c('icon', 'measure'), '_', extra = 'merge')
# icon measure
# 1 armsling R_1_Group
# 2 armsling R_1_Rank
# 3 armsling R_2_Group
...
...
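As for why _{1} and _+? seem to match everything: the quantifier only limits how many underscores a single match may contain, not how many matches are found on a line. A global search (gsub() in R, or the global flag on regexr.com) still reports every underscore. Here is a minimal base R sketch of the difference; the vector y and the result names are made up for illustration.

```r
# Illustrative input; in the question these values live in one data frame column
y <- c("armsling_R_1_Group", "armsling_C_1")

# sub() stops after the first match in each element
first_only <- sub("_", " ", y)

# gsub() keeps searching, so "_{1}" still hits every underscore
all_matches <- gsub("_{1}", " ", y)

# regexpr() reports the position of the first match only
first_pos <- as.integer(regexpr("_", y))

print(first_only)
print(all_matches)
print(first_pos)
```

So limiting the result to the first underscore is a matter of which function does the searching (sub() or regexpr() rather than gsub() or gregexpr()), not of the quantifier.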
For a regular expression solution, I would use strapply from the gsubfn package:
library(gsubfn)
# In the formula, x and y are the two capture groups (not the data frame x)
m <- strapply(as.character(x$y), '([^_]*)_(.*)',
              ~ c(icon = x, measure = y), simplify = rbind)
X <- as.data.frame(m, stringsAsFactors = FALSE)
# icon measure
# 1 armsling R_1_Group
# 2 armsling R_1_Rank
# 3 armsling R_2_Group
...
...
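If you would rather not add a package, the same pattern should also work in base R. This is only a sketch: the vector y stands in for the column from the question, and it assumes every value contains at least one underscore (a value without one would yield no capture groups and break the row binding).

```r
# Stand-in for the column in the question
y <- c("armsling_R_1_Group", "armsling_R_1_Rank", "armsling_C_1")

# regexec() returns the full match followed by each capture group
parts <- regmatches(y, regexec("^([^_]*)_(.*)$", y))

# Drop the full match (first element) and bind the two groups as rows
m <- do.call(rbind, lapply(parts, function(p) p[-1]))

X <- data.frame(icon = m[, 1], measure = m[, 2], stringsAsFactors = FALSE)
print(X)
```

Because [^_]* cannot cross an underscore, the first group ends at the first underscore and (.*) takes everything after it, including any later underscores.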
Upvotes: 2