Omar Gonzales

Reputation: 4008

R: add new column to a list of data frames with lapply

I've read this and written an lapply call to add a "SubCat" column to every data frame within a list.

This is my code:

    my_list <- lapply(1:length(my_list),     
               function(i) cbind(my_list[[i]], my_list[[i]]["SubCat"] <- as.character(""))) 

But I get this error:

    Error in `[<-.data.frame`(`*tmp*`, "SubCat", value = "") : 
      replacement has 1 row, data has 0 

What's wrong?

When I use it on a single data frame, it works:

    my_list[[1]]["SubCat"] <- as.character("")

UPDATE:

These are examples of my data frames; they all have the same structure: one column for the SKU and one for the category.

DataFrame 1:

    row.names       SKU         Tv.y.Video
1   1699        2018143169254P  Tv.y.Video
2   1700        2018143169254   Tv.y.Video
3   1946        2018144678120P  Tv.y.Video
4   1947        2018144678120   Tv.y.Video
5   2366        2018146411831P  Tv.y.Video
6   2367        2018146411831   Tv.y.Video

DataFrame 2:

    row.names   SKU             Cómputo
1     6       2004121460000P    Cómputo
2     7       2004121460000     Cómputo
3     8       2004121440002P    Cómputo
4     9       2004121440002     Cómputo
5     10      2004123030003P    Cómputo
6     11      2004123030003     Cómputo

When I apply my code to just one data frame, it works:

    my_list[[1]]["SubCat"] <- as.character("")

Result:

    row.names       SKU         Tv.y.Video    SubCat
1   1699        2018143169254P  Tv.y.Video   
2   1700        2018143169254   Tv.y.Video   
3   1946        2018144678120P  Tv.y.Video   
4   1947        2018144678120   Tv.y.Video   
5   2366        2018146411831P  Tv.y.Video   
6   2367        2018146411831   Tv.y.Video   

UPDATE 1:

I also have some empty data.frames in the list.

Upvotes: 7

Views: 15738

Answers (1)

anon

Reputation:

It's because my_list[[i]]["SubCat"] <- as.character("") is an assignment, and an assignment evaluates only to the assigned value (here ""), not to the modified data frame. So cbind never receives a data frame with the new column. Also, lapply already calls your function on each and every data frame in your list, so you don't need to loop over the indices. Your command should look as follows:

    vec.1 <- c(1, 2)
    vec.2 <- c(2, 3)
    df.1 <- data.frame(vec.1, vec.2)
    df.2 <- data.frame(vec.2, vec.1)
    my_list <- list(df.1, df.2)
    ## This is the correct use of lapply for your list
    my_list <- lapply(my_list, cbind, SubCat = c(""))
    my_list
    [[1]]
      vec.1 vec.2 SubCat
    1     1     2       
    2     2     3       

    [[2]]
      vec.2 vec.1 SubCat
    1     2     1       
    2     3     2
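
As a quick check of what an assignment expression evaluates to when it is used as a value (for example as a function argument), here is a minimal sketch. The toy data frame and the name res are made up for illustration and are not the asker's data:

```r
# Hypothetical toy data frame, not the asker's data.
df <- data.frame(SKU = c("A1", "A2"), stringsAsFactors = FALSE)

# Capture the value of the assignment expression itself.
res <- (df["SubCat"] <- "")

is.data.frame(res)   # FALSE: res is not the modified data frame
identical(res, "")   # TRUE: it is just the assigned value

# The data frame was modified as a side effect of the assignment.
names(df)            # "SKU" "SubCat"
```

So a function that should hand back the modified data frame has to return the data frame explicitly rather than the assignment.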

EDIT: lapply takes a list and a function, and applies the function to each and every one of the list's elements. However, cbind needs two arguments here: the data frame and the new column. lapply passes any additional arguments (here SubCat = c("")) on to the function. Now, you may notice that the SubCat vector consists of a single empty string; that is OK, because R recycles that vector as many times as needed to fill the column.
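
A small sketch of that recycling, using a made-up three-row data frame (the names df and out are assumptions for illustration only):

```r
# Hypothetical three-row data frame.
df <- data.frame(x = 1:3)

# The length-one vector "" is recycled to match the three rows of df.
out <- cbind(df, SubCat = "")

nrow(out)              # 3
all(out$SubCat == "")  # TRUE: every row holds an empty string
```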

EDIT #2: Hmm, this error is probably coming from the empty data.frames, which I didn't take into account (a vector cannot be recycled zero times). You could do this to solve your problem:

    my_list <- lapply(my_list, function(df) {
        if (nrow(df) > 0)
            cbind(df, SubCat = c(""))
        else
            cbind(df, SubCat = character())
    })

Added by the author of the question:

It is OK to fill a column with blanks ("") to match the length of the other columns, using SubCat = c(""). But if you have an empty data frame, you need to create the new column with SubCat = character(), which is a zero-length vector.
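
Because character(n) returns a vector of n empty strings (and a zero-length vector when n is 0), the two branches could probably be collapsed into one expression. This is only a sketch of an alternative, not the answer's method, and the example list below is made up for illustration:

```r
# Hypothetical list mixing a populated and an empty data frame.
my_list <- list(
  data.frame(SKU = c("A1", "A2"), stringsAsFactors = FALSE),
  data.frame(SKU = character(), stringsAsFactors = FALSE)
)

my_list <- lapply(my_list, function(df) {
  # character(nrow(df)) has exactly one empty string per row,
  # so no recycling is needed, even when there are zero rows.
  df$SubCat <- character(nrow(df))
  df  # return the modified data frame, not the assignment
})

sapply(my_list, nrow)   # row counts are unchanged
lapply(my_list, names)  # both data frames now have a SubCat column
```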

Upvotes: 5
