Reputation: 407
Hi, I am trying to kill two birds with one stone.
Firstly, if column b is populated, copy it to new (no issue here). Secondly, if column b is blank, extract part of the string in column a (everything after "TASK " and before the next space) and put it in new.
a <- c("11-010 Bla", "TASK 21 MMM", "TASK 03-11-11 Hah")
b <- c("11-010","","")
new <- c("","","")
df <- data.frame(a,b,new)
a                  b       new
11-010 Bla         11-010
TASK 21 MMM
TASK 03-11-11 Hah
Desired output:
a                  b       new
11-010 Bla         11-010  11-010
TASK 21 MMM                21
TASK 03-11-11 Hah          03-11-11
I tried to extract the task number with the attempts below, but I can't get the pattern to stop at the space. The task number is always followed by a space.
gsub("^[^_]*TASK|\\.[^.]*\\s$", "", df$a)
sub(".*?TASK=(.*?)' '.*", "\\1", df$a)
Upvotes: 0
Views: 90
Reputation: 79218
sub("(.*\\s)?(\\d.*?\\s).*","\\2",a)
[1] "11-010 " "21 " "03-11-11 "
regmatches(a,regexpr("\\d.*?\\s",a))
[1] "11-010 " "21 " "03-11-11 "
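Both results above keep the trailing space that the pattern matches. A minimal sketch of one way to drop it, assuming R 3.2.0 or later for trimws() and using perl = TRUE (my choice, not part of the answer) so the lazy quantifier follows PCRE rules:

```r
a <- c("11-010 Bla", "TASK 21 MMM", "TASK 03-11-11 Hah")

# First run of text that starts with a digit and ends at the next whitespace.
# perl = TRUE is an assumption made here so that ".*?" is handled by PCRE.
hits <- regmatches(a, regexpr("\\d.*?\\s", a, perl = TRUE))

# Strip the whitespace that the pattern captured along with the number.
nums <- trimws(hits)
nums
```

Note that regmatches() silently drops elements with no match, so the result can be shorter than a if some strings contain no digit followed by whitespace.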
Upvotes: 0
Reputation: 24074
When b is an empty string, you can capture everything between "TASK " and the next space with the following regex:
sub(".*TASK ([^ ]+) .+", "\\1", df$a[df$b==""])
# [1] "21" "03-11-11"
\\1 refers back to whatever was captured between the parentheses in the regex, which in this case is [^ ]+: anything but a space, one or more times.
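To see the capture group in isolation, here is a small sketch on single strings. The variable names (pattern, matched, unmatched) are only for illustration. It also shows why the call above is restricted to the rows where b is empty: when the pattern does not match, sub() returns its input unchanged.

```r
pattern <- ".*TASK ([^ ]+) .+"

# The whole string matches, so it is replaced by the captured group.
matched <- sub(pattern, "\\1", "TASK 03-11-11 Hah")
matched

# No "TASK " in the string: nothing matches and the input comes back as is.
unmatched <- sub(pattern, "\\1", "11-010 Bla")
unmatched
```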
You can put that directly in df with:
df$new[df$b==""] <- sub(".*TASK ([^ ]+) .+", "\\1", df$a[df$b==""])
#                  a      b      new
#1        11-010 Bla 11-010   11-010
#2       TASK 21 MMM              21
#3 TASK 03-11-11 Hah        03-11-11
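If you want both steps in one go (copy b when it is filled, otherwise extract from a), a possible sketch uses ifelse(). This assumes the columns are character vectors, hence stringsAsFactors = FALSE, which is my addition and only matters on R versions before 4.0:

```r
a <- c("11-010 Bla", "TASK 21 MMM", "TASK 03-11-11 Hah")
b <- c("11-010", "", "")

# stringsAsFactors = FALSE is an assumption: it keeps a and b as character
# columns on older R versions, where data.frame() created factors by default.
df <- data.frame(a, b, stringsAsFactors = FALSE)

# Use b where it is populated; otherwise take the token that follows "TASK ".
df$new <- ifelse(df$b != "",
                 df$b,
                 sub(".*TASK ([^ ]+) .+", "\\1", df$a))
df
```

ifelse() evaluates the sub() call on every row, but only the results for rows with an empty b are kept, so the unmatched first row does no harm.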
Upvotes: 2