Antonio Serrano

Reputation: 942

Store variables in a list when looping

The following list "ls" contains three data frames:

unigrams = data.frame(freq = c(3, 3, 5, 4, 3, 41),
                  term = c("a-list", "a-p", "aaa", "aam", "aamir", "aaron"))
bigrams = data.frame(freq = c(13, 1, 1, 2, 1, 4),
                 term = c("a a", "a abode", "a about", "a absolutely", "a accessory", "a acre")) 
trigrams = data.frame(freq = c(1, 1, 1, 1, 1, 1),
                  term = c("a a card", "a a divorce", "a a dreamer", "a a great", "a a guy", "a a hand"))
ls = list(unigrams, bigrams, trigrams)

Which gives us this:

[[1]]
  freq   term
1    3 a-list
2    3    a-p
3    5    aaa
4    4    aam
5    3  aamir
6   41  aaron

[[2]]
  freq         term
1   13          a a
2    1      a abode
3    1      a about
4    2 a absolutely
5    1  a accessory
6    4       a acre

[[3]]
  freq        term
1    1    a a card
2    1 a a divorce
3    1 a a dreamer
4    1   a a great
5    1     a a guy
6    1    a a hand

I want to split the column "term" in each data frame into one column per word, creating the columns "word1", "word2" and "word3" as needed. Like this:

  freq  word1
1    3 a-list
2    3    a-p
3    5    aaa
4    4    aam
5    3  aamir
6   41  aaron

  freq     word1        word2
1   13         a            a
2    1         a        abode
3    1         a        about
4    2         a   absolutely
5    1         a    accessory
6    4         a         acre

  freq     word1        word2        word3
1    1         a            a         card
2    1         a            a      divorce
3    1         a            a      dreamer
4    1         a            a        great
5    1         a            a          guy
6    1         a            a         hand

My attempt:

new_ls = list()
for (i in length(ls)) {
    x = ls[[i]]
    # Split each word in column "term":
    x[,paste("word", 1:i, sep = "")] = as.character(lapply(strsplit(as.character(x$term), split=" "), "[", i))
    x = subset(x, select = -term)
    new_ls[[i]] = x
}

Unfortunately, this snippet leaves the first two elements empty and stores a wrong result in the last one:

[[1]]
NULL

[[2]]
NULL

[[3]]
  freq   word1   word2   word3
1    1    card    card    card
2    1 divorce divorce divorce
3    1 dreamer dreamer dreamer
4    1   great   great   great
5    1     guy     guy     guy
6    1    hand    hand    hand

What am I doing wrong?

Upvotes: 2

Views: 44

Answers (1)

Sotos

Reputation: 51582

The splitstackshape library makes this task easy:

library(splitstackshape)
lapply(ls, function(i) cSplit(i, 'term', sep = ' ', direction = 'wide'))
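
As for what goes wrong in the loop from the question: `for (i in length(ls))` iterates over the single value 3 rather than 1, 2, 3, so elements 1 and 2 stay NULL, and `"[", i` extracts only the i-th word, which is then recycled into every new column. Note also that, as far as I know, cSplit names the new columns term_1, term_2, ... rather than word1, word2, .... If you want to stay in base R, here is a minimal sketch of a corrected loop. It assumes that every term in the i-th data frame has exactly i space-separated words, and it uses shortened stand-ins for the data frames in the question:

```r
# Shortened stand-ins for the data frames in the question (assumption: same structure)
unigrams = data.frame(freq = c(3, 3), term = c("a-list", "a-p"))
bigrams  = data.frame(freq = c(13, 1), term = c("a a", "a abode"))
trigrams = data.frame(freq = c(1, 1), term = c("a a card", "a a divorce"))
ls = list(unigrams, bigrams, trigrams)

new_ls = vector("list", length(ls))
for (i in seq_along(ls)) {   # seq_along() visits every index, not just the last one
    x = ls[[i]]
    # Split each term on spaces; the result is a list of character vectors
    words = strsplit(as.character(x$term), split = " ")
    # Column word<j> receives the j-th word of every term
    for (j in seq_len(i)) {
        x[[paste0("word", j)]] = vapply(words, function(w) w[j], character(1))
    }
    x$term = NULL            # drop the original column
    new_ls[[i]] = x
}
new_ls
```

If a term has fewer words than expected, the corresponding cell ends up as NA instead of raising an error.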

Upvotes: 1
