Reputation: 131
My dataframe is this:
data <- data.frame(column = c("word1 word2 word3", "word2 word1", "word3 word2", "word1 word2", "word3", "word1 word2"))
data
column
1 word1 word2 word3
2 word2 word1
3 word3 word2
4 word1 word2
5 word3
6 word1 word2
I want to retain the part "word1" in all rows where it occurs and remove the other parts of those strings.
My preferred output is this:
column
1 word1
2 word1
3 word3 word2
4 word1
5 word3
6 word1
I tried data$column %>% str_replace("^[word1]*", " "), but that didn't do what I wanted.
Upvotes: 0
Views: 172
Reputation: 8110
Another option: check whether word1 occurs in the column and, if it does, replace the whole string with word1.
library(tidyverse)
# tibble() replaces data_frame(), which is deprecated in current tibble releases
data <- tibble(column = c("word1 word2 word3", "word2 word1", "word3 word2", "word1 word2", "word3", "word1 word2"))
data %>% mutate(column = if_else(grepl("word1", column), "word1", column))
#> # A tibble: 6 x 1
#> column
#> <chr>
#> 1 word1
#> 2 word1
#> 3 word3 word2
#> 4 word1
#> 5 word3
#> 6 word1
Created on 2018-08-25 by the reprex package (v0.2.0).
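One caveat with a plain substring test: grepl("word1", ...) also matches longer tokens such as "word10". If that matters for your data, a word-boundary pattern may be safer. Below is a minimal sketch in base R; the vector x and the "word10" entry are made up for illustration and are not part of the question's data.

```r
# Hypothetical input: "word10" is added only to show the difference.
x <- c("word1 word2", "word10 word2", "word3")

# Substring match: "word10 word2" is also collapsed to "word1".
loose <- ifelse(grepl("word1", x), "word1", x)

# Word-boundary match: only the standalone token "word1" triggers the replacement.
strict <- ifelse(grepl("\\bword1\\b", x), "word1", x)

print(loose)
print(strict)
```

The same pattern can be dropped into the if_else() call above if you prefer to stay within dplyr.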
Upvotes: 0
Reputation: 50678
Here is one possibility:
library(tidyverse)
data %>% mutate(column = str_replace(column, "^.*word1.*$", "word1"))
column
1 word1
2 word1
3 word3 word2
4 word1
5 word3
6 word1
Or, with a capture group:
data %>% mutate(column = str_replace(column, "^.*(word1).*$", "\\1"))
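If you would rather not load the tidyverse, the same regular expression should work with base R's sub(). This is a sketch under the assumption that the column is a character vector (hence stringsAsFactors = FALSE, which is already the default from R 4.0 onward).

```r
# Same data as in the question, kept as character rather than factor.
data <- data.frame(
  column = c("word1 word2 word3", "word2 word1", "word3 word2",
             "word1 word2", "word3", "word1 word2"),
  stringsAsFactors = FALSE
)

# Rows containing "word1" match the whole-string pattern and are replaced;
# rows without it do not match and are returned unchanged.
data$column <- sub("^.*word1.*$", "word1", data$column)

print(data)
```

Because the pattern is anchored with ^ and $, sub() either replaces the entire string or leaves it alone, which is what keeps rows 3 and 5 intact.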
Upvotes: 1