xtufer

Reputation: 169

dplyr unnest() not working for large comma separated data

I'm trying to use unnest() (which is in tidyr, not dplyr) to split a large character column whose values are separated by commas. The data set has the form:

id                                    keywords
835a24fe-c276-9824-0f4d-35fc81319cca  Analytics,Artificial Intelligence,Big Data,Health Care

I want to create a table that has the "id" in column one and each of the "keywords" as a separate entry in a second column, paired with the same "id".

I'm using the code:

CB_keyword <- tibble(id=organizations$uuid[organizations$uuid %in% org_uuid ] , 
                     keyword=organizations$category_list[organizations$uuid %in% org_uuid]) %>% unnest(keyword, names_sep = ",")

The %in% filter selects the "id" and "keyword" values from another table, and that part works correctly. Piping to unnest() seems to do nothing: the tibble is unchanged except that the column is now named "keyword,keyword" instead of "keyword". The data is the same as if unnest() had not been called.

Upvotes: 2

Views: 556

Answers (1)

akrun

Reputation: 887501

If keywords is a string column, use separate_rows() instead of unnest(). unnest() expands list-columns, so it has nothing to split in a plain character column.

library(dplyr)
library(tidyr)
df1 %>%
    separate_rows(keywords, sep=",\\s*")
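For a self-contained check, here is a minimal sketch that rebuilds the single example row from the question and applies the same call. The tibble df1 and the result names long and long2 are stand-ins made up for illustration, not objects from the original data. The second half shows one possible way to keep unnest() in the pipeline, assuming the strings are first turned into a list-column with strsplit().

```r
library(dplyr)
library(tidyr)

# Hypothetical one-row stand-in for the question's data
df1 <- tibble(
  id = "835a24fe-c276-9824-0f4d-35fc81319cca",
  keywords = "Analytics,Artificial Intelligence,Big Data,Health Care"
)

# Split on commas (plus any trailing whitespace) into one row per keyword;
# the id is repeated on every resulting row
long <- df1 %>%
  separate_rows(keywords, sep = ",\\s*")

# Alternative sketch: build a list-column first, then unnest() it
long2 <- df1 %>%
  mutate(keywords = strsplit(keywords, ",\\s*")) %>%
  unnest(keywords)

print(long)
```

Both long and long2 should hold four rows, one per keyword, each carrying the same id.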

Upvotes: 3
