kmangyo

Reputation: 1207

String processing in R (find and replace)

This is an example of my data.frame

no string
1  abc&URL_drf
2  abcdef&URL_efg

I need to replace the text matching *&URL (everything up to and including &URL) with "", so that I get this result:

no string
1 _drf
2 _efg

In Excel, I can easily get this result by using '*&URL' in the 'Find and Replace' function. However, I cannot find an efficient method in R.

In R, my approach is below.

First, I split the string using strsplit(df$string, "&URL") and then selected the second element. I don't think this is an efficient way.

Is there a more efficient method?
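For reference, the split-and-select approach described above might look roughly like the sketch below. The data frame is rebuilt from the sample in the question, and `fixed = TRUE` is an added assumption so that "&URL" is treated as a literal string rather than a regular expression.

```r
# Sample data as shown in the question
df <- data.frame(no = 1:2,
                 string = c("abc&URL_drf", "abcdef&URL_efg"),
                 stringsAsFactors = FALSE)

# Split each string at the literal "&URL"; this gives a list of pieces
parts <- strsplit(df$string, "&URL", fixed = TRUE)

# Keep the second piece of each split (the part after "&URL")
df$string <- sapply(parts, `[`, 2)
```

This works for the sample, but it needs two steps and silently yields NA for any string that does not contain "&URL".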

Upvotes: 0

Views: 313

Answers (3)

Sven Hohenstein

Reputation: 81693

Another approach:

df <- transform(df, string = sub(".*&URL", "", string))

#  no string
# 1  1   _drf
# 2  2   _efg
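One detail that may matter on other data: `.*` is greedy, so the pattern removes everything up to the last "&URL" in a string. The input below is hypothetical (it is not from the question) and only illustrates the difference between the greedy pattern and a lazy variant, which is written here with `perl = TRUE`.

```r
# Hypothetical input containing two "&URL" markers
x <- "a&URL_b&URL_c"

# Greedy: strips everything up to and including the LAST "&URL"
greedy <- sub(".*&URL", "", x)

# Lazy: strips everything up to and including the FIRST "&URL"
lazy <- sub(".*?&URL", "", x, perl = TRUE)
```

For the sample data in the question each string has a single "&URL", so both variants give the same result.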

Upvotes: 0

KFB

Reputation: 3501

# data
df <- read.table(text="no string
1  abc&URL_drf
2  abcdef&URL_efg", header=TRUE, as.is=TRUE)

# `gsub` substitutes the unwanted part of each string with nothing,
# hence the `""`. The unwanted part is described by a regular
# expression: one or more lowercase letters followed by "&URL".

df$string <- gsub("[a-z]+(&URL)", "", df$string)
# you get
#   no string
# 1  1   _drf
# 2  2   _efg
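Note that `[a-z]+` only covers prefixes whose character immediately before "&URL" is a lowercase letter. The second element below is a hypothetical input, not part of the question's data, used to sketch how this pattern compares with the broader `.*&URL`.

```r
# First element is from the question; the second is a hypothetical
# prefix that ends in digits
x <- c("abc&URL_drf", "page123&URL_efg")

# Letters-only pattern: no lowercase letter directly precedes "&URL"
# in the second element, so it is left unchanged
narrow <- gsub("[a-z]+(&URL)", "", x)

# Broader pattern: removes whatever precedes "&URL"
broad <- sub(".*&URL", "", x)
```

Which pattern is preferable depends on whether the prefix is guaranteed to consist of lowercase letters.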

Upvotes: 3

Ehsan

Reputation: 4474

I suggest you use the grep function.

The grep function takes your regex as the first argument and the input vector as the second argument. If you pass value=TRUE, grep returns a vector with copies of the actual elements in the input vector that could be (partially) matched.

So in your case:

grep("[a-z]+(&URL)", df$string, perl=TRUE, value=TRUE)
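It may help to see what this call returns compared with a substitution. In the sketch below, the third element is a hypothetical non-matching value added for illustration. `grep` selects the matching elements and returns them unchanged, so removing the prefix still requires `sub` or `gsub`.

```r
# Two values from the question plus a hypothetical one without "&URL"
x <- c("abc&URL_drf", "abcdef&URL_efg", "no_marker")

# grep with value = TRUE returns the matching elements as they are
matched <- grep("[a-z]+(&URL)", x, perl = TRUE, value = TRUE)

# sub performs the actual replacement; non-matching elements pass through
replaced <- sub("[a-z]+(&URL)", "", x, perl = TRUE)
```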

Upvotes: 0
