How can a dataframe containing a vector list have a different length when unlisted?

Question

Given a data frame with n rows, I ran sapply and assigned the result to a new column of the data frame:

my_df$index <- sapply(my_df$local_db_uuid,function(x) which(my_df$remote_db_uuid== x))

However, I noticed the following:

join_ref_id_complete$index %>% length()
# returns length of dataframe rows

Versus:

join_ref_id_complete$index %>% unlist() %>% length()
# returns less than length of dataframe rows

What is going on here with the length? Are these missing values?

akrun · Accepted Answer

It is possible that some list elements don't have a match and return integer(0), which gets dropped while unlisting. Using a simple example:

lst1 <- list(c(5, 0), c(3, 2, 4), 5)
sapply(lst1, function(x) which(x == 5))
#[[1]]
#[1] 1

#[[2]]
#integer(0)

#[[3]]
#[1] 1

When we unlist, the second element is dropped:

unlist(sapply(lst1, function(x) which(x == 5)))
#[1] 1 1

This returns a length of 2 instead of 3.
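To check which elements come back empty, one option is base R's lengths(), which reports the number of matches per list element. This is only a sketch on the same toy data; lapply is used here instead of sapply so the result is always a list, and the name res is made up for illustration.

```r
lst1 <- list(c(5, 0), c(3, 2, 4), 5)
res <- lapply(lst1, function(x) which(x == 5))

# Number of matches per list element; 0 marks an integer(0) result
lengths(res)
# [1] 1 0 1

# Positions of the elements that unlist() will drop
which(lengths(res) == 0)
# [1] 2
```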

But it is just a coincidence that the length is less. It can be greater as well:

lst1 <- list(c(5, 0, 5, 5), c(3, 2, 4), c(5, 3, 5))
unlist(sapply(lst1, function(x) which(x == 5)))
#[1] 1 3 4 1 3

Here, the length is 5, i.e. more than the length of the list. It could also be equal, just by coincidence.
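Put differently, the unlisted length is the sum of the per-element match counts, not the number of elements. The sketch below illustrates this on a made-up data frame (the values are hypothetical; only the column names come from the question). If the goal is exactly one index per row, match() may be a closer fit, assuming the first match is what is wanted: it returns the first matching position, or NA when there is none.

```r
# Hypothetical data standing in for the question's data frame
my_df <- data.frame(
  local_db_uuid  = c("a", "b", "c", "d"),
  remote_db_uuid = c("b", "a", "b", "x"),
  stringsAsFactors = FALSE
)

idx <- lapply(my_df$local_db_uuid, function(x) which(my_df$remote_db_uuid == x))

# unlist() concatenates all matches, so its length is the sum of the counts
length(unlist(idx)) == sum(lengths(idx))
# [1] TRUE

# match() gives the first matching position, or NA when there is no match,
# so the result always has one entry per row
my_df$index <- match(my_df$local_db_uuid, my_df$remote_db_uuid)
my_df$index
# [1]  2  1 NA NA
```

Note that match() discards any additional matches, so rows with several matches (like "b" above) keep only the first position.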
