MKP

Reputation: 21

extract words from a string into different strings

I'm very new to coding, and I have to clean a table of string variables. One of the columns I'm trying to clean packs several variables into a single string. A single row of that column looks like this:

string <- "'casual': True,'classy': False,'divey': False,'hipster': False,'intimate': False,'romantic': False,'touristy': False,'trendy': False,'upscale': False"

I'm trying to extract the Boolean value for each category into its own column. So my result should have 9 columns (one per category), and the rows should contain True/False values.

What am I supposed to use in this case?

Upvotes: 2

Views: 42

Answers (1)

akrun

Reputation: 887048

An option is to use str_extract_all to extract each word (\\w+) that follows a colon and a space. The lookbehind (?<=: ) checks for that prefix without including it in the match:

library(stringr)
as.logical(str_extract_all(string, "(?<=: )\\w+")[[1]])
#[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
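
The call above returns the values without their category names. As a minimal sketch (not part of the original answer), the names can be kept by capturing both sides of each pair with str_match_all; the shortened `string` and the variable names `m` and `flags` are assumptions made for illustration:

```r
library(stringr)

# Shortened example input (assumption: same 'key': Value layout as the question)
string <- "'casual': True,'classy': False,'divey': False"

# Each row of m is one match: column 1 is the full match,
# column 2 the category name, column 3 the True/False text
m <- str_match_all(string, "'(\\w+)': (\\w+)")[[1]]

# Named logical vector: names are the categories, values the parsed flags
flags <- setNames(as.logical(m[, 3]), m[, 2])
flags
```

Keeping the names alongside the values makes it easier to check that each flag lands in the intended column later on.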

If we need to parse the string into a data.frame, it would be better to turn it into valid JSON and read it with fromJSON from jsonlite:

library(jsonlite)
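# Wrap every word in double quotes, drop the single quotes,
# and enclose the result in {} so that it is valid JSON
lst1 <- fromJSON(paste0("{", gsub("'", "", gsub("\\b(\\w+)\\b",
           '"\\1"', string)), "}"))

# The values come back as the strings "True"/"False"; convert each to logical
data.frame(lapply(lst1, as.logical))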
#   casual classy divey hipster intimate romantic touristy trendy upscale
#1   TRUE  FALSE FALSE   FALSE    FALSE    FALSE    FALSE  FALSE   FALSE

Or in base R, using the same lookbehind pattern:

as.logical(regmatches(string, gregexpr("(?<=: )\\w+", string, perl = TRUE))[[1]])
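
The question asks for one column per category across all rows, so here is a rough base R sketch of applying the same idea to a whole column. The data frame `df`, its column `attrs`, and the helper `parse_row` are hypothetical names, and the sketch assumes every row lists the same categories:

```r
# Hypothetical input: a data frame whose `attrs` column holds the raw strings
df <- data.frame(
  attrs = c(
    "'casual': True,'classy': False,'divey': False",
    "'casual': False,'classy': True,'divey': False"
  ),
  stringsAsFactors = FALSE
)

# Turn one raw string into a one-row data frame of logical columns
parse_row <- function(s) {
  # Category names: words enclosed in single quotes and followed by a colon
  keys <- regmatches(s, gregexpr("(?<=')\\w+(?=':)", s, perl = TRUE))[[1]]
  # Values: words that follow a colon and a space
  vals <- regmatches(s, gregexpr("(?<=: )\\w+", s, perl = TRUE))[[1]]
  as.data.frame(as.list(setNames(as.logical(vals), keys)))
}

# Parse every row, stack the results, and attach them to the original table
flags <- do.call(rbind, lapply(df$attrs, parse_row))
result <- cbind(df, flags)
```

If some rows are missing categories, rbind will fail because the column sets differ, so those rows would need to be padded first.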

Upvotes: 1
