Reputation: 997
I have a dataframe that looks as follows:
df <- data.frame(one = c("s1_below_10", "s2_below_20"),
                 two = c("s3_above_10", "s4_above_10"))
I want to replace all the strings by the number preceding the first underscore. In other words, the desired output is
1 3
2 4
I would like to know how I can perform this replacement (the dataset is very large). Thanks for your help.
Upvotes: 2
Views: 2007
Reputation: 93938
The basic gsub call would be something like:
gsub("^.+?(\\d+)_.+","\\1",df$one)
[1] "1" "2"
Which you could lapply to each column:
# the lazy .+? keeps multi-digit numbers whole; a greedy .+ would leave only the last digit
data.frame(lapply(df, gsub, pattern="^.+?(\\d+)_.+", replacement="\\1"))
one two
1 1 3
2 2 4
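If numeric columns are wanted rather than strings, the same idea can be sketched as below. This is only a sketch under two assumptions that go beyond the question: the text before the number contains no digits, and the result should be numeric. The helper name extract_id is hypothetical.

```r
df <- data.frame(one = c("s1_below_10", "s2_below_20"),
                 two = c("s3_above_10", "s4_above_10"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: capture the digits directly before the first underscore.
# "^\\D*" skips leading non-digit characters, so multi-digit numbers stay intact.
# Assumption: nothing before the number is itself a digit.
extract_id <- function(x) as.numeric(sub("^\\D*(\\d+)_.*$", "\\1", x))

# Assigning into df[] keeps the data frame structure and the column names.
df[] <- lapply(df, extract_id)
df
```

Since sub() is vectorised over each column, this should stay reasonably fast on a large dataset; only the lapply runs once per column.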
Upvotes: 4
Reputation: 60180
If the values you want are always the second character of the string (which seems to be true of all your examples), you can do this with substr:
data.frame(lapply(df, substr, 2, 2))
Output:
one two
1 1 3
2 2 4
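The fixed-position approach stops working once a number has more than one digit. A possible variant, assuming the prefix is always exactly one character (an assumption, not something the question states), is to cut the string at the first underscore and then drop the first character. The vector x below is made-up illustration data.

```r
# Made-up input with a two-digit number, to show where substr(x, 2, 2) falls short
x <- c("s12_below_10", "s3_above_10")

second_char <- substr(x, 2, 2)      # keeps only one character per string

# Drop everything from the first underscore on, then skip the 1-character prefix.
# Assumption: the prefix before the number is exactly one character long.
ids <- substring(sub("_.*", "", x), 2)
ids
```

The same expression can be passed through lapply over the columns of df in the way shown above.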
Upvotes: 2