Reputation: 3432

Transform a column into a list inside a dataframe

I have a dataframe such as

COL1 COL2 
A    "[Lasius_niger]" 
B    "[Canis_lupus,Feis_cattus]"
C    "[Cattus_stigmatizans,Cattus_cattus]"
D    "[Apis_mellifera]"

In my code I iterate over each row of df$COL2 and pass it to a command that needs the content to be a list. So I need to transform df$COL2 into a list column inside the dataframe.

So I should get something like this, I guess:

COL1 COL2 
A    "Lasius_niger" 
B    "Canis_lupus","Feis_cattus"
C    "Cattus_stigmatizans","Cattus_cattus"
D    "Apis_mellifera"

Does anyone have an idea?

Upvotes: 0

Answers (3)

akrun

Reputation: 887831

We can also use str_extract_all

library(stringr)
df$COL2 <- str_extract_all(df$COL2, "\\w+")

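Or another option from qdapRegex, which extracts the text between the square brackets (each row comes back as one string, so it still needs to be split on the comma)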

library(qdapRegex)
rm_square(df$COL2, extract = TRUE)
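The "\\w+" pattern works because brackets and commas are not word characters, so every run of letters, digits and underscores becomes its own element. A minimal base R sketch of the same idea, in case stringr is not available (the vector `x` is just illustrative sample input):

```r
# Illustrative input in the same bracketed, comma-separated format.
x <- c("[Lasius_niger]", "[Canis_lupus,Feis_cattus]")

# gregexpr() finds every run of word characters; regmatches() pulls them out,
# returning one character vector per input string.
words <- regmatches(x, gregexpr("\\w+", x))
str(words)
```

One caveat with any word-based pattern: a name containing a space or a hyphen would be cut into several pieces, so splitting on the comma is the safer choice for such data.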

Upvotes: 0

B. Christian Kamgang

Reputation: 6529

You can also use the function stri_extract_all_words in the stringi package as follows

df$COL2 <- stringi::stri_extract_all_words(df$COL2)

str(df)
#'data.frame':  4 obs. of  2 variables:
# $ COL1: chr  "A" "B" "C" "D"
# $ COL2:List of 4
#  ..$ : chr "Lasius_niger"
#  ..$ : chr  "Canis_lupus" "Feis_cattus"
#  ..$ : chr  "Cattus_stigmatizans" "Cattus_cattus"
#  ..$ : chr "Apis_mellifera"

Upvotes: 0

Ronak Shah

Reputation: 389235

Remove the opening and closing square brackets using gsub, then split each string on the comma.

df$COL2 <- strsplit(gsub('\\[|\\]', '', df$COL2), ',')
str(df)
#'data.frame':  4 obs. of  2 variables:
# $ COL1: chr  "A" "B" "C" "D"
# $ COL2:List of 4
#  ..$ : chr "Lasius_niger"
#  ..$ : chr  "Canis_lupus" "Feis_cattus"
#  ..$ : chr  "Cattus_stigmatizans" "Cattus_cattus"
#  ..$ : chr "Apis_mellifera"

data

df <- structure(list(COL1 = c("A", "B", "C", "D"), COL2 = c("[Lasius_niger]", 
"[Canis_lupus,Feis_cattus]", "[Cattus_stigmatizans,Cattus_cattus]", 
"[Apis_mellifera]")), class = "data.frame", row.names = c(NA, -4L))
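Once COL2 is a list column, each row's content can be passed to a command one element at a time. A minimal sketch of that iteration, assuming a stand-in function `process_species` (hypothetical, replace it with the real command):

```r
# Rebuild the example data so the block runs on its own.
df <- data.frame(
  COL1 = c("A", "B", "C", "D"),
  COL2 = c("[Lasius_niger]", "[Canis_lupus,Feis_cattus]",
           "[Cattus_stigmatizans,Cattus_cattus]", "[Apis_mellifera]"),
  stringsAsFactors = FALSE
)

# Strip the brackets and split on commas: COL2 becomes a list column.
df$COL2 <- strsplit(gsub("\\[|\\]", "", df$COL2), ",")

# Hypothetical stand-in for the asker's command: joins the names of one row.
process_species <- function(species) paste(species, collapse = " + ")

# Apply the command to each row's character vector.
result <- vapply(df$COL2, process_species, character(1))

# For an explicit loop, df$COL2[[i]] (double brackets) gives row i's vector.
n_species <- lengths(df$COL2)
```

Using `[[i]]` rather than `[i]` matters here: single brackets return a list of length one, while double brackets return the character vector itself.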

Upvotes: 2
