DaniCee

Reputation: 3207

R: sub() using vector as pattern

Say I have a data frame like the following:

df=data.frame(a=LETTERS, b=paste0(LETTERS,1:length(LETTERS)))

It looks like this:

> df
   a   b
1  A  A1
2  B  B2
3  C  C3
4  D  D4
5  E  E5
6  F  F6
7  G  G7
8  H  H8
9  I  I9
10 J J10
11 K K11
12 L L12
...

The only thing I want to do is remove df$a from df$b, so that the resulting data frame looks like:

> df
   a  b
1  A  1
2  B  2
3  C  3
4  D  4
5  E  5
6  F  6
7  G  7
8  H  8
9  I  9
10 J 10
11 K 11
12 L 12
...

For that, I explicitly want to use sub() with df$a as the pattern. This data frame is just an example, so I do not want to use strsplit() or a specific regex in sub() (because my df$a can get pretty complicated).

I try:

df$b=sub(paste0("^",df$a) , "", df$b)

But obviously I get:

Warning message:
In sub(paste0("^", df$a), "", df$b) :
  argument 'pattern' has length > 1 and only the first element will be used

So what would be the right way to do this? Thanks!

Upvotes: 1

Views: 155

Answers (1)

Rui Barradas

Reputation: 76402

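Use mapply to remove df$a from df$b. sub() is not vectorized over its pattern argument, so mapply calls it once per row, pairing each element of df$a with the corresponding element of df$b.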

df <- data.frame(a=LETTERS, b=paste0(LETTERS,1:length(LETTERS)))

mapply(\(x, y) sub(x, "", y), df$a, df$b)
#>    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O    P 
#>  "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9" "10" "11" "12" "13" "14" "15" "16" 
#>    Q    R    S    T    U    V    W    X    Y    Z 
#> "17" "18" "19" "20" "21" "22" "23" "24" "25" "26"

Created on 2022-03-16 by the reprex package (v2.0.1)
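The question anchored the pattern with "^", which the call above leaves out. A minimal sketch of the same mapply idea with the anchor put back, assuming the values in df$a are safe to use as regular expressions; the names pattern, string and stripped are only illustrative, and USE.NAMES = FALSE is added so the result comes back as a plain unnamed vector:

```r
df <- data.frame(a = LETTERS, b = paste0(LETTERS, seq_along(LETTERS)),
                 stringsAsFactors = FALSE)

# Pair each pattern with its own string; "^" anchors the match to the start.
# USE.NAMES = FALSE drops the names that mapply would otherwise take from df$a.
stripped <- mapply(
  function(pattern, string) sub(paste0("^", pattern), "", string),
  df$a, df$b,
  USE.NAMES = FALSE
)
head(stripped)
#> [1] "1" "2" "3" "4" "5" "6"
```

The function(...) form is used instead of the \(x, y) shorthand only so that the sketch also runs on R versions older than 4.1.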

To assign the result back to df$b, keeping the values as character strings, run:

df$b <- mapply(\(x, y) sub(x, "", y), df$a, df$b)
head(df)
#>   a b
#> 1 A 1
#> 2 B 2
#> 3 C 3
#> 4 D 4
#> 5 E 5
#> 6 F 6

Created on 2022-03-16 by the reprex package (v2.0.1)
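Since the question mentions that df$a can get complicated, note that sub() treats each element as a regular expression, so characters such as ".", "+" or "(" would not match literally. One possible workaround, sketched here under the assumption that df$a should be matched as literal text at the start of df$b, avoids regex altogether with base R's startsWith() and substring(). The helper strip_prefix and the data frame df2 are hypothetical, made up for illustration:

```r
# Hypothetical helper (not from the answer above): drop `prefix` from the
# start of `x` only where `x` literally begins with it. No regex is involved.
strip_prefix <- function(x, prefix) {
  x <- as.character(x)
  prefix <- as.character(prefix)
  has_prefix <- startsWith(x, prefix)
  ifelse(has_prefix, substring(x, nchar(prefix) + 1L), x)
}

# Made-up data: the values in `a` contain regex metacharacters.
df2 <- data.frame(
  a = c("A.1", "B+", "(C)"),
  b = c("A.1x", "B+y", "(C)z"),
  stringsAsFactors = FALSE
)
df2$b <- strip_prefix(df2$b, df2$a)
df2$b
#> [1] "x" "y" "z"
```

Both startsWith() and substring() are vectorized over their arguments, so no mapply is needed here. Rows where b does not start with a are left unchanged.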

Upvotes: 2
