Reputation: 5254
I have a list that is the result of a row selection in a data frame. The issue is that sometimes there is no row to select and it returns a list in this form: a non-empty list with no actual content.
L <- list(combattech = character(0), damage = character(0), bonus = character(0),
range = structure(list(close = character(0), medium = character(0), far = character(0)),
row.names = integer(0), class = "data.frame"),
ammo = character(0), weight = character(0), name = character(0),
price = character(0), sf = character(0))
I want to verify if I actually have a meaningful result and not a list with all elements being empty vectors. But a list with empty vectors is not equivalent to an empty list:
length(L) == 0
#> [1] FALSE
does not give me TRUE because the length is 9, not 0.
Of course, I could simply check length(which(...row selection...)) before I pick the selection, and usually I do, but in this case I do not have access to the original row indices.
all(sapply(L, length) == 0)
#> [1] FALSE
also does not work (i.e. returns FALSE) because the nested data frame range has length 3 (its number of columns).
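A small sketch of why the nested data frame breaks the length check: length() on a data frame counts columns, while nrow() and NROW() count rows. The object name rng is made up for illustration and is not part of the original list.

```r
# Hypothetical zero-row data frame with three columns, mirroring `range`.
rng <- data.frame(close = character(0), medium = character(0), far = character(0))

length(rng)         # counts columns: 3
nrow(rng)           # counts rows: 0
NROW(rng)           # same as nrow() for a data frame: 0
NROW(character(0))  # falls back to length() for plain vectors: 0
```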
Created on 2020-06-28 by the reprex package (v0.3.0)
Upvotes: 3
Views: 385
Reputation: 5254
I did some checking, and all proposed solutions work both in the positive case (L is empty) …
L0 <- list(combattech = character(0), damage = character(0), bonus = character(0),
range = structure(list(close = character(0), medium = character(0), far = character(0)),
row.names = integer(0), class = "data.frame"),
ammo = character(0), weight = character(0), name = character(0), price = character(0), sf = character(0))
all(rapply(L0, length) == 0) # Solution 1
#> [1] TRUE
all(sapply(L0, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0) # Solution 2
#> [1] TRUE
all(sapply(L0, NROW) == 0) # Solution 3
#> [1] TRUE
length(unlist(L0)) == 0 # Solution 4
#> [1] TRUE
require(purrr)
#> Loading required package: purrr
every(L0, ~ NROW(.) == 0) # Solution 5
#> [1] TRUE
… and in the negative case (L has content)
L1 <- list(combattech = "ranged", damage = "1d", bonus = "+3",
range = structure(list(close = "20", medium = "40", far = "80"),
row.names = integer(0), class = "data.frame"),
ammo = "arrow", weight = "1.5 Stone", name = "Bow", price = "120 silver", sf = "3/5")
all(rapply(L1, length) == 0) # Solution 1
#> [1] FALSE
all(sapply(L1, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0) # Solution 2
#> [1] FALSE
all(sapply(L1, NROW) == 0) # Solution 3
#> [1] FALSE
length(unlist(L1)) == 0 # Solution 4
#> [1] FALSE
every(L1, ~ NROW(.) == 0) # Solution 5
#> [1] FALSE
Using NROW directly, however, does not work, even when we coerce L1 into a data frame:
NROW(as.data.frame(L1)) == 0 # Solution 6 only works with empty lists
#> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0
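The error comes from as.data.frame() itself, which refuses components whose row counts disagree. If the coercion route is still wanted, one option is to guard it with tryCatch(). This is only a sketch: safe_nrow is a hypothetical helper name, and returning NA on failure is an assumption about what the caller wants.

```r
# Hypothetical helper (not from the thread): returns NA when coercion fails.
safe_nrow <- function(x) {
  tryCatch(NROW(as.data.frame(x)), error = function(e) NA_integer_)
}

safe_nrow(list(a = character(0), b = character(0)))  # components agree: 0 rows
safe_nrow(list(a = 1:2, b = 1:3))                    # lengths differ: coercion fails, NA
```

An NA result then has to be handled separately by the caller, which is one reason the element-wise checks above are simpler.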
I wanted to choose an approach based on performance, benchmarking both the positive and the negative case.
require(microbenchmark)
#> Loading required package: microbenchmark
L40 <- list(combattech = rep("ranged", 40), damage = rep(paste0(1:2, "d"), each = 20), bonus = paste0("+", 1:40),
range = structure(list(close = "20", medium = "40", far = "80"), row.names = integer(0), class = "data.frame"),
ammo = rep(c("arrow", "bolt"), 20), weight = paste0(0.5*1:40, " Stone"), name = rep(c("bow", "crossbow"), 20), price = paste(seq(10, 10*40, 10), "silver"), sf = rep("3/5", 40))
microbenchmark(
unlist = {length(unlist(L0)) == 0; length(unlist(L1)) == 0; length(unlist(L40)) == 0},
rapply = {all(rapply(L0, length) == 0); all(rapply(L1, length) == 0); all(rapply(L40, length) == 0)},
NROW = {all(sapply(L0, NROW) == 0); all(sapply(L1, NROW) == 0); all(sapply(L40, NROW) == 0)},
long.one = {all(sapply(L0, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0); all(sapply(L1, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0); all(sapply(L40, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0)},
purrr = {every(L0, ~ NROW(.) == 0); every(L1, ~ NROW(.) == 0); every(L40, ~ NROW(.) == 0)},
times = 5E3)
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> unlist 81.5 83.4 84.68564 84.2 84.90 1365.7 5000
#> rapply 27.9 31.9 36.44792 34.1 35.60 6015.9 5000
#> NROW 51.3 56.0 60.63962 58.0 60.30 1657.4 5000
#> long.one 61.1 67.2 72.01368 69.4 71.90 3727.1 5000
#> purrr 97.7 108.2 116.74834 111.6 114.95 1917.5 5000
I am glad that I finally added an example with 40 rows. With only 1 row (as in L1) the unlist approach showed the best performance, by far. But with 40 rows the situation has changed.
So, the final recommendation is: use rapply if the list usually contains a larger number of rows and you want to filter out occasional empty lists.
Created on 2020-06-28 by the reprex package (v0.3.0)
Upvotes: 0
Reputation: 39858
One purrr solution using the basic logic provided by @user20650 and @Ronak Shah:
every(L, ~ NROW(.) == 0)
[1] TRUE
Upvotes: 1
Reputation: 269491
1) We can use rapply to recursively walk the structure and return a flat result.
all(rapply(L, length) == 0)
## [1] TRUE
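To illustrate what the recursive walk produces, here is a minimal sketch on a smaller list. The names nested and leaf_lengths are made up for this example; the point is that rapply() descends into the data frame and reports one length per leaf vector.

```r
# Hypothetical two-element list: one plain vector, one zero-row data frame.
nested <- list(a = character(0),
               range = data.frame(close = character(0), far = character(0)))

# One length per leaf vector (a, close, far), flattened into a single vector.
leaf_lengths <- rapply(nested, length)

length(leaf_lengths)    # three leaves were visited
all(leaf_lengths == 0)  # every leaf is empty
```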
2) Another approach is to unlist it first:
length(unlist(L)) == 0
## [1] TRUE
Upvotes: 3
Reputation: 388907
You can check whether each element of the list is a data frame and, if so, return its number of rows instead of its length:
all(sapply(L, function(x) if(is.data.frame(x)) nrow(x) else length(x)) == 0)
#[1] TRUE
We can use NROW, as suggested by @user20650, which makes this more compact.
all(sapply(L, NROW) == 0)
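If the check is needed in several places, it could be wrapped in a small function. The sketch below uses vapply() instead of sapply() so the result type is fixed; that substitution and the names is_empty_selection, empty_sel and filled_sel are assumptions for illustration, not part of the answers above.

```r
# Hypothetical wrapper: TRUE when every element has zero rows / zero length.
is_empty_selection <- function(x) {
  all(vapply(x, NROW, integer(1)) == 0L)
}

# Illustrative inputs, shaped like the lists in the question.
empty_sel  <- list(name = character(0),
                   range = data.frame(close = character(0), far = character(0)))
filled_sel <- list(name = "Bow",
                   range = data.frame(close = "20", far = "80"))

is_empty_selection(empty_sel)   # TRUE
is_empty_selection(filled_sel)  # FALSE
```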
Upvotes: 3