user29651977

Reputation: 19

How do I modify a <list> dataframe column?

I'm working in RStudio, trying to clean up a dataset of Pokémon I converted from a json file, and I've got this dataframe named bp1:

    species     item        ability
    <chr>       <list>      <chr>
1   Aegislash   <chr [2]>   Stance Change
2   Aegislash   <chr [1]>   Stance Change
3   Aegislash   <chr [1]>   Stance Change
4   Aegislash   <chr [1]>   Stance Change
5   Aegislash   <chr [1]>   Stance Change

If I run

bp1$item 

I get this as console output:

[[1]]
[1] "Weakness Policy" "Ghostium Z"

[[2]]
[1] "Life Orb"

[[3]]
[1] "Ghostium Z"

[[4]]
[1] "Focus Sash"

[[5]]
[1] "Leftovers"

What I want is to modify bp1 so that I get something like this:

    species     item1             item2        ability
    <chr>       <char>            <char>       <chr>
1   Aegislash   Weakness Policy   Ghostium Z   Stance Change
2   Aegislash   Life Orb          NA           Stance Change
3   Aegislash   Ghostium Z        NA           Stance Change
4   Aegislash   Focus Sash        NA           Stance Change
5   Aegislash   Leftovers         NA           Stance Change

Although the number of rows here is small enough that I could just make the new dataframe by hand, bp1 is basically just a subset of my data so I need a solution that can generalize. I guess the number of item columns could be whatever the length of the longest character list in the original item column is, if that makes sense.

So far I've been trying to turn each character vector in the item column into a single string, so that item becomes a plain character column instead of a list column. In the first row, Weakness Policy and Ghostium Z would be combined into one string, with a comma or something similar as a separator, like this:

item
<char>
Weakness Policy, Ghostium Z
Life Orb
Ghostium Z
Focus Sash
Leftovers

I think that if I get it to that point, I might have some code that can split the column the way I want it to be split.

I've tried some stuff involving lapply() and paste():

bp1$item<-lapply(bp1$item, function(y) paste(y))
bp1%>%
  mutate(item=as.character(item))%>%
  mutate(item=paste(item, collapse = ';'))
bp1$item<-lapply(bp1$item, as.character)

But I haven't managed to get the item column to not have any list elements.

Upvotes: 1

Views: 93

Answers (1)

jay.sf

Reputation: 73562

lengths() on the item column gives the number of items for each observation; you're looking for the max. The `length<-`() function sets the length of an object; applying it to each list element of the item column pads the shorter elements with NA up to that maximum. Doing this in sapply() simplifies the result to a matrix automatically, which, after transposing, can easily be cbind()ed.

> cbind(bp1[-2], item=t(sapply(bp1$item, `length<-`, max(lengths(bp1$item)))))
    species       ability          item.1     item.2
1 Aegislash Stance Change Weakness Policy Ghostium Z
2 Aegislash Stance Change        Life Orb       <NA>
3 Aegislash Stance Change      Ghostium Z       <NA>
4 Aegislash Stance Change      Focus Sash       <NA>
5 Aegislash Stance Change       Leftovers       <NA>
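
If the one-liner is hard to read, the same idea can be spelled out step by step. This is only a sketch under a few assumptions: it uses lapply() plus do.call(rbind, ...) instead of sapply() plus t(), the intermediate names n_max, padded, items and res are made up for illustration, and the column names item1, item2 are chosen to match the layout asked for in the question. It also assumes every list element has at least one item and R >= 4.0 (older versions would turn the new columns into factors).

```r
# Sample data, copied from the dput() output in this answer
bp1 <- structure(list(species = c("Aegislash", "Aegislash", "Aegislash",
"Aegislash", "Aegislash"), item = list(c("Weakness Policy", "Ghostium Z"
), "Life Orb", "Ghostium Z", "Focus Sash", "Leftovers"), ability = c("Stance Change",
"Stance Change", "Stance Change", "Stance Change", "Stance Change"
)), row.names = c(NA, -5L), class = "data.frame")

# 1. Length of the longest item vector (2 for this data)
n_max <- max(lengths(bp1$item))

# 2. Pad every list element with NA up to that length
padded <- lapply(bp1$item, `length<-`, n_max)

# 3. Stack the padded vectors row-wise into a character matrix
items <- do.call(rbind, padded)
colnames(items) <- paste0("item", seq_len(n_max))

# 4. Drop the original list column and bind the new columns on
res <- cbind(bp1[setdiff(names(bp1), "item")], items)
```

Because n_max is computed from the data, the number of item columns follows the longest element in the full dataset rather than being fixed at two.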

Data:

> dput(bp1)
structure(list(species = c("Aegislash", "Aegislash", "Aegislash", 
"Aegislash", "Aegislash"), item = list(c("Weakness Policy", "Ghostium Z"
), "Life Orb", "Ghostium Z", "Focus Sash", "Leftovers"), ability = c("Stance Change", 
"Stance Change", "Stance Change", "Stance Change", "Stance Change"
)), row.names = c(NA, -5L), class = "data.frame")
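
As for the intermediate step described in the question (one comma-separated string per row), a minimal sketch could look like the following. The column name item_chr is an assumption for illustration only. The attempts in the question likely fail for these reasons: lapply() always returns a list, as.character() on a list column deparses each element rather than joining it, and paste(..., collapse = ) inside mutate() collapses the whole column into one string instead of working row by row.

```r
# Sample data, copied from the dput() output in this answer
bp1 <- structure(list(species = c("Aegislash", "Aegislash", "Aegislash",
"Aegislash", "Aegislash"), item = list(c("Weakness Policy", "Ghostium Z"
), "Life Orb", "Ghostium Z", "Focus Sash", "Leftovers"), ability = c("Stance Change",
"Stance Change", "Stance Change", "Stance Change", "Stance Change"
)), row.names = c(NA, -5L), class = "data.frame")

# Collapse each list element into a single string.
# vapply() returns an atomic character vector (not a list), and
# character(1) makes it fail loudly if any result is not one string.
bp1$item_chr <- vapply(bp1$item, paste, character(1), collapse = ", ")
```

The padding approach above skips this detour, but the collapsed column can be handy for display or export.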

Upvotes: 1
