Mario Trost

Reputation: 107

R: mapply(gsub...) gives different results than gsub(...)

Part of my data:

data <- c('googel', 'googele', 'googl', 'google .de', 'google kalender',
        'google maps', 'google.ch', 'www.google.ch', 'factbook', 'facebock',
        'facebok', 'facebook', 'facebook.ch', 'facebook.com', 'facebook.de', 'facebooke')

I have to replace all google-like words with 'Google' and all facebook-like words with 'Facebook'. I can do this with the following code:

### Google coding
> google <- gsub(pattern = '.*go.*g.*l.*', replacement = 'Google', data)

### Facebook coding
> fbGoogle <- gsub(pattern = '.*fa.*bo.*k.*', replacement = 'Facebook', google)
> plyr::count(fbGoogle)
         x freq
1 Facebook    8
2   Google    8

I would like to do this using mapply, a vector for patterns and one for replacements. Although I use the same (quite primitive, I know) regex, I get different results than before:

> ### Google and Facebook together
> patterns <- c('.*go.*g.*l.*', '.*fa.*bo.*k.*')
> replacements <- c('Google', 'Facebook')
> fbGoogleFail <- mapply(gsub, patterns, replacements, data)
> plyr::count(fbGoogleFail)
               x freq
1        facebok    1
2       Facebook    4
3    facebook.ch    1
4    facebook.de    1
5       factbook    1
6        googele    1
7         Google    4
8     google .de    1
9    google maps    1
10 www.google.ch    1

Any idea where this goes wrong? Any help is much appreciated.

Upvotes: 1

Views: 360

Answers (1)

user2100721

Reputation: 3587

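Try this. It vectorizes gsub over the pattern and the replacement only, so every pattern is applied to the whole of data: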

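gsubv <- Vectorize(function(pat, repl, data) gsub(pattern = pat, replacement = repl, x = data),
                   vectorize.args = c("pat", "repl"))
# output is a matrix with one column per pattern
output <- gsubv(pat = patterns, repl = replacements, data = data)
# drop the entries that still equal the original data
output[-match(data, output)]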
#[1] "Google"   "Google"   "Google"   "Google"   "Google"   "Google"   "Google"  
#[8] "Google"   "Facebook" "Facebook" "Facebook" "Facebook" "Facebook" "Facebook"
#[15] "Facebook" "Facebook"
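
As for why the mapply call in the question behaves differently: mapply walks over all of its arguments in parallel and recycles the shorter ones, so each element of data is paired with only one of the two patterns (odd positions get the Google pattern, even positions the Facebook one). Below is a minimal sketch that shows this pairing and then one possible alternative using Reduce, which applies each pattern/replacement pair in turn to the whole vector. The names paired_pattern and result are just illustrative, not from the original posts.

```r
# Same inputs as in the question
data <- c('googel', 'googele', 'googl', 'google .de', 'google kalender',
          'google maps', 'google.ch', 'www.google.ch', 'factbook', 'facebock',
          'facebok', 'facebook', 'facebook.ch', 'facebook.com', 'facebook.de', 'facebooke')
patterns <- c('.*go.*g.*l.*', '.*fa.*bo.*k.*')
replacements <- c('Google', 'Facebook')

# mapply() recycles patterns to the length of data, so data[i] is only
# ever tested against the single pattern shown next to it here
paired_pattern <- rep_len(patterns, length(data))
head(data.frame(data, paired_pattern), 4)

# Alternative: apply every pattern/replacement pair, one after the other,
# to the whole vector (the result of one step feeds the next)
result <- Reduce(
  function(x, i) gsub(patterns[i], replacements[i], x),
  seq_along(patterns),
  accumulate = FALSE,
  data
)
table(result)  # Facebook and Google each appear 8 times
```

Unlike output[-match(data, output)], this does not rely on every element being changed by exactly one pattern; elements that match nothing are simply left as they are.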

Upvotes: 0
