Reputation: 3994
I have a dataset that looks something like this:
ID Type Count
1 **Radisson** 8
2 **Renaissance** 9
3 **Hilton** New York Only 8
4 **Radisson** East Cost 8
I want to get a dataset that looks like this:
ID Type Count
1 **Radisson** 8
2 **Renaissance** 9
3 **Hilton** 8
4 **Radisson** 8
Or even without the asterisks, if at all possible.
Any solutions?
Upvotes: 3
Views: 46
Reputation: 886948
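Here is an option with str_extract, which keeps only the first match of the pattern (leading asterisks, the name, closing asterisks) and drops whatever follows: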
library(stringr)
library(dplyr)
df %>%
  mutate(Type = str_extract(Type, "[*]*[^*]*[*]*"))
# Type Count
#1 **Radisson** 8
#2 **Renaissance** 9
#3 **Hilton** 8
#4 **Radisson** 8
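If the asterisks should go as well, one possible extension is to strip them from the extracted token with str_remove_all. This is only a sketch: the vector name type and the sample values are assumptions that mirror the question's data, not part of the original answer.

```r
library(stringr)

# Hypothetical sample mirroring the Type column from the question
type <- c("**Radisson**", "**Renaissance**",
          "**Hilton** New York Only", "**Radisson** East Cost")

# Step 1: keep only the first starred token (same pattern as above)
starred <- str_extract(type, "[*]*[^*]*[*]*")

# Step 2: remove every literal asterisk from that token
bare <- str_remove_all(starred, "[*]")
bare
# [1] "Radisson"    "Renaissance" "Hilton"      "Radisson"
```

Inside mutate() the same two calls can be nested around the Type column.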
Upvotes: 0
Reputation: 20085
A solution is to use strsplit on ** and pick the 2nd element:
# as.character() guards against Type being a factor, which strsplit rejects
df$Type <- sapply(strsplit(as.character(df$Type), split = "\\*{2}"), function(x) x[2])
df
# ID Type Count
# 1 1 Radisson 8
# 2 2 Renaissance 9
# 3 3 Hilton 8
# 4 4 Radisson 8
Upvotes: 0
Reputation: 5893
You could just use a substitution that keeps the starred name at the beginning and drops everything after it.
df <- data.frame(Type = c("**Radisson**", "**Renaissance**", "**Hilton** New York Only",
                          "**Radisson** East Cost"),
                 Count = c(8, 9, 8, 8))
gsub("^(\\*{2}.*\\*{2}).*", "\\1", df$Type, perl = TRUE)
[1] "**Radisson**" "**Renaissance**" "**Hilton**" "**Radisson**"
So, assigning the result back to the column:
df$Type <- gsub("^(\\*{2}.*\\*{2}).*", "\\1", df$Type, perl = TRUE)
df
Type Count
1 **Radisson** 8
2 **Renaissance** 9
3 **Hilton** 8
4 **Radisson** 8
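One caveat worth checking: the inner .* is greedy, so if the trailing text ever contained another pair of stars, the match would extend to the last one. A lazy quantifier (.*?) would stop at the first closing pair instead. The input below is a made-up example, not from the question's data, used only to show the difference.

```r
# Hypothetical input with a second starred word in the trailing text
x <- "**Hilton** New York **Only**"

# Greedy: the group runs up to the LAST "**", so nothing is removed
greedy <- sub("^(\\*{2}.*\\*{2}).*", "\\1", x, perl = TRUE)

# Lazy: the group stops at the FIRST closing "**"
lazy <- sub("^(\\*{2}.*?\\*{2}).*", "\\1", x, perl = TRUE)

greedy
# [1] "**Hilton** New York **Only**"
lazy
# [1] "**Hilton**"
```

For the four rows in the question both patterns give the same result, so this only matters if the real data is messier.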
Upvotes: 3