Reputation: 57
For example, I have a column whose values look something like this:
"Object=house colour=blue size=big", "Object=roof colour=red size=small", "Object=window colour=green size=medium"
I want to extract just the word after colour and put it in a new column, so it would look like this:
"blue", "red", "green"
I have started trying to do this with str_extract but am very lost on how to specify the pattern. So far I have:
colour<- str_extract(string = df, pattern = "(?<=colour= ).*(?=\\,)")
How would I go about solving this?
Upvotes: 1
Views: 57
Reputation: 5788
Base R solution:
df <- c(
  "Object=house colour=blue size=big",
  "Object=roof colour=red size=small",
  "Object=window colour=green size=medium"
)
res <- data.frame(
  object_info = df,
  object_colour = gsub(
    ".*\\s+colour\\=(\\S+).*",
    "\\1",
    df
  ),
  row.names = NULL,
  stringsAsFactors = FALSE
)
res
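One caveat worth noting: gsub() returns its input unchanged when the pattern does not match, so a row without a colour field would end up holding the whole string. Below is a minimal sketch of one way to guard against that, assuming missing colours should become NA. The second test string and the names x, pattern, and colour are made up for illustration.

```r
x <- c(
  "Object=house colour=blue size=big",
  "Object=door size=small"  # hypothetical row with no colour field
)
pattern <- ".*\\s+colour=(\\S+).*"

# Keep the captured group only where the pattern matches; otherwise use NA
colour <- ifelse(grepl(pattern, x), sub(pattern, "\\1", x), NA_character_)
colour
```

Like the answer's pattern, this one requires whitespace before colour, so a string that starts with colour= would not match either.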
Upvotes: 0
Reputation: 79218
You can also convert the strings to DCF format and read the result into a data.frame:
read.dcf(textConnection(paste(chartr("= ", ":\n", df), collapse = "\n\n")), all = TRUE)
  Object colour   size
1  house   blue    big
2   roof    red  small
3 window  green medium
Then you can select the column you want
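A sketch of that last step, written out with intermediate variables. The names dcf_text, con, and parsed are only illustrative, and df is assumed to be the character vector from the question.

```r
df <- c(
  "Object=house colour=blue size=big",
  "Object=roof colour=red size=small",
  "Object=window colour=green size=medium"
)

# Turn "key=value key=value" into "key:value" lines, one record per string,
# with a blank line between records, as the DCF format expects
dcf_text <- paste(chartr("= ", ":\n", df), collapse = "\n\n")

con <- textConnection(dcf_text)
parsed <- read.dcf(con, all = TRUE)  # all = TRUE returns a data frame
close(con)

# Select only the colour column
parsed$colour
```

This approach parses every field at once, which may be convenient if the other fields are needed later too.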
Upvotes: 1
Reputation: 887108
There is no space after the = in the data, so the lookbehind should not include one. We can also use \\S+ to match one or more non-whitespace characters:
library(stringr)
str_extract(string = df, pattern = "(?<=colour=)\\S+")
[1] "blue" "red" "green"
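Since the goal in the question was a new column, and str_extract() expects a character vector rather than a whole data frame, that step might look like the sketch below. The data frame name dat and the column name info are assumptions, because the question does not show them.

```r
library(stringr)

# Hypothetical data frame holding the strings in a column called info
dat <- data.frame(
  info = c(
    "Object=house colour=blue size=big",
    "Object=roof colour=red size=small",
    "Object=window colour=green size=medium"
  ),
  stringsAsFactors = FALSE
)

# Pass the column (a character vector), not the data frame itself
dat$colour <- str_extract(dat$info, "(?<=colour=)\\S+")
dat$colour
```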
df <- c("Object=house colour=blue size=big", "Object=roof colour=red size=small",
"Object=window colour=green size=medium")
Upvotes: 2
Reputation: 25323
A possible solution:
library(tidyverse)
c("Object=house colour=blue size=big", "Object=roof colour=red size=small", "Object=window colour=green size=medium") %>%
str_extract("(?<=colour\\=)\\S+")
#> [1] "blue" "red" "green"
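If lookbehinds feel opaque, a capture group is another way to express the same idea. The sketch below uses str_match(), which returns a character matrix with the full match in the first column and each group in the following columns. The variable names x and colour are only illustrative.

```r
library(stringr)

x <- c(
  "Object=house colour=blue size=big",
  "Object=roof colour=red size=small",
  "Object=window colour=green size=medium"
)

# Column 1 is the whole match ("colour=blue"), column 2 is the captured group
colour <- str_match(x, "colour=(\\S+)")[, 2]
colour
```

Rows with no match come back as NA, the same as with str_extract().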
Upvotes: 1