Extracting words from multiple strings in R

Question

I have the following string as an example in one column in my data frame:

 A = [{'name': 'Bank', 'id': 559}, {'name': 'Cinema', 'id': 2550}, {'name': 'Shopping', 'id': 10201}]

I tried the following code to extract the words ('Bank', 'Cinema', 'Shopping') from this string, but it returns 'character(0)':

 stringr::str_extract_all(A, "\\w+(?='\\})")

How can I do this?

Gopala · Accepted Answer

Since that is nearly JSON (only the single quotes need to become double quotes), you can do something like this:

library(jsonlite)

A <- "[{'name': 'Bank', 'id': 559}, {'name': 'Cinema', 'id': 2550}, {'name': 'Shopping', 'id': 10201}]"
A <- gsub("'", '"', A) # fromJSON expects double quotes.

l <- fromJSON(A)
l$name
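If you would rather stay with a regular expression, as in the question, here is a minimal base R sketch. The original pattern returns nothing because its lookahead requires a word to be followed directly by '}, while each name is actually followed by ', 'id': and each id is followed by a bare }. The sketch assumes every name is preceded by exactly 'name': ' (one space after the colon) and contains no single quote; pattern and name_matches are just illustrative variable names.

```r
A <- "[{'name': 'Bank', 'id': 559}, {'name': 'Cinema', 'id': 2550}, {'name': 'Shopping', 'id': 10201}]"

# Lookbehind: match whatever follows "'name': '" up to the next single quote.
# Assumes that exact spacing and no apostrophes inside the names.
pattern <- "(?<='name': ')[^']+"

# gregexpr() locates every match; regmatches() extracts them,
# returning a list with one element per input string.
name_matches <- regmatches(A, gregexpr(pattern, A, perl = TRUE))
name_matches[[1]]
```

This is more fragile than parsing the JSON, since any change in spacing or quoting breaks the pattern, so the fromJSON approach is probably the safer default.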

EDIT: Assuming you have a column with multiple JSON arrays like A and not just one JSON array as you showed above in your question, you will need to do something like this:

df <- data.frame(A = rep("[{'name': 'Bank', 'id': 559}, {'name': 'Cinema', 'id': 2550}, {'name': 'Shopping', 'id': 10201}]", 5), stringsAsFactors = FALSE) # base R, so no extra package is needed

df$A <- gsub("'", '"', df$A)
lapply(df$A, function(x) {j <- fromJSON(x); j$name})

I just repeated the same JSON array string you provided five times to create a 5-row data frame, then used lapply on each 'row' to get the names from it. The result is a list with one character vector of names per row.
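If you want the names back in the data frame rather than in a separate list, one possible sketch is to collapse each row's names into a single string. The helper get_names and the new column names are hypothetical, introduced only for illustration, and the quote replacement assumes no value contains an apostrophe.

```r
library(jsonlite)

# Same toy data as above: one quasi-JSON array repeated over five rows.
df <- data.frame(
  A = rep("[{'name': 'Bank', 'id': 559}, {'name': 'Cinema', 'id': 2550}, {'name': 'Shopping', 'id': 10201}]", 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap single quotes for double quotes, parse the
# array, and keep only the 'name' field.
# Assumes no value contains an apostrophe.
get_names <- function(x) fromJSON(gsub("'", '"', x))$name

# vapply() returns a plain character vector: one comma-separated string per row.
df$names <- vapply(
  df$A,
  function(x) paste(get_names(x), collapse = ", "),
  character(1),
  USE.NAMES = FALSE
)
df$names[1]
```

Keeping the list returned by lapply is the better choice if you need the individual names later; the collapsed string is mainly convenient for display.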
