NuValue

Reputation: 463

Filtering list of list values in R without using for

I am translating for() loops into apply()-family calls (sapply, lapply, mapply, etc.) to improve the performance of my R code. I have a list of xts objects named lt that looks like this:

lt

$`11141550000`
            y
2016-02-05 18
2016-03-03  8
2016-03-30  6
2016-04-26  0

$`11140780000`
           y
2016-03-25 2
2016-03-30 0
2016-04-04 0
2016-04-09 0
2016-04-14 0

$`11141550000`
            y
2016-02-05 18
2016-03-03  8
2016-07-16 10
2016-08-12 10

One chunk of my code is extremely slow (I understand that for loops are often inefficient in R and are best avoided where possible). I originally wrote the chunk like this:

for (i in 1:length(lt)){
  lt[[i]] <- lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}

I am trying to translate this for loop into a fast Filter, sapply, or lapply call that keeps, in each element of the list, only the observations dated before "2018-11-01". However, I have not been able to get it working:

1st attempt:

f <- function(i){
  lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}
lapply(lt, function(x) f(x))

But I received this error:

 Error in lt[[i]] : recursive indexing failed at level 2 

2nd attempt:

f <- function(i){
  lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}
Filter(function(x) f, lt)

But I received this message:

Error in Filter(function(x) f, lt) : 
  (list) object cannot be coerced to type 'logical'

3rd attempt:

Filter(f, lt)

But again, I received an error:

Error in lt[[i]] : recursive indexing failed at level 2

I would appreciate any help translating this for loop, as I need to better understand how the apply functions work.

P.S. 1

Thanks to J.Gourlay's feedback, I am adding a sample of my list with dput:

> dput(lt)
structure(list(`11140780000` = structure(c(2, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 4, 0, 0, 0, 0, 0, 0, 1
), .Dim = c(132L, 1L), .Dimnames = list(NULL, "y"), index = structure(c(1458864000, 
1459296000, 1459728000, 1460160000, 1460592000, 1461024000, 1461456000, 
1461888000, 1462320000, 1462752000, 1463184000, 1463616000, 1464048000, 
1464480000, 1464912000, 1465344000, 1465776000, 1466208000, 1466640000, 
1467072000, 1467504000, 1467936000, 1468368000, 1468800000, 1469232000, 
1469664000, 1470096000, 1470528000, 1470960000, 1471392000, 1471824000, 
1472256000, 1472688000, 1473120000, 1473552000, 1473984000, 1474416000, 
1474848000, 1475280000, 1475712000, 1476144000, 1476576000, 1477008000, 
1477440000, 1477872000, 1478304000, 1478736000, 1479168000, 1479600000, 
1480032000, 1480464000, 1480896000, 1481328000, 1481760000, 1482192000, 
1482624000, 1483056000, 1483142400, 1483574400, 1484006400, 1484438400, 
1484870400, 1485302400, 1485734400, 1486166400, 1486598400, 1487030400, 
1487462400, 1487894400, 1488326400, 1488758400, 1489190400, 1489622400, 
1490054400, 1490486400, 1490918400, 1491350400, 1491782400, 1492214400, 
1492646400, 1493078400, 1493510400, 1493942400, 1494374400, 1494806400, 
1495238400, 1495670400, 1496102400, 1496534400, 1496966400, 1497398400, 
1497830400, 1498262400, 1498694400, 1499126400, 1499558400, 1499990400, 
1500422400, 1500854400, 1501286400, 1501718400, 1502150400, 1502582400, 
1503014400, 1503446400, 1503878400, 1504310400, 1504742400, 1505174400, 
1505606400, 1506038400, 1506470400, 1506902400, 1507334400, 1507766400, 
1508198400, 1508630400, 1509062400, 1509494400, 1509926400, 1510358400, 
1510790400, 1511222400, 1511654400, 1512086400, 1512518400, 1512950400, 
1513382400, 1513814400, 1514246400, 1514678400, 1514764800), tzone = "UTC", tclass = "Date"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", class = c("xts", 
"zoo")), `11141550000` = structure(c(18, 8, 6, 0, 4, 8, 10, 10, 
0, 23, 0, 8, 0, 2, 14, 16, 20, 4, 4, 4), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", class = c("xts", 
"zoo"), index = structure(c(1454630400, 1456963200, 1459296000, 
1461628800, 1463961600, 1466294400, 1468627200, 1470960000, 1473292800, 
1475625600, 1477958400, 1480291200, 1482624000, 1483142400, 1483833600, 
1486166400, 1488499200, 1490832000, 1493164800, 1495497600), tzone = "UTC", tclass = "Date"), .Dim = c(20L, 
1L), .Dimnames = list(NULL, "y"))), .Names = c("11140780000", 
"11141550000"))

P.S. 2

As per Jozef's suggestion, I confirm that my code uses the zoo library; that is, the index function called in the for loop comes from the zoo package.

Upvotes: 1

Views: 442

Answers (1)

Parfait

Reputation: 107567

Essentially, your function f expects an integer index, but lapply(lt, ...) passes each element of lt itself. Consider passing seq_along(lt) as the input to lapply. Also remember that, unlike for loops, apply functions return objects, so assign the result of lapply to an object. Finally, when a function takes a single required argument, there is no need to wrap it in an anonymous function; pass f directly.

f <- function(i){
  lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}

# Iterating over indices drops the list names, so restore them from lt
new_lt <- setNames(lapply(seq_along(lt), f), names(lt))

If your for loop works as you describe, new_lt should be exactly the same as lt after running the for loop:

all.equal(lt_after_for_loop, new_lt)
# [1] TRUE

identical(lt_after_for_loop, new_lt)
# [1] TRUE
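
The error in the first attempt comes down to what lapply hands to the function: each element of the list, not its position. A minimal base-R sketch, using a hypothetical toy list named demo (plain numeric vectors, not the xts data from the question), shows the two calling styles side by side:

```r
# Hypothetical toy list standing in for lt (plain numeric vectors, not xts)
demo <- list(a = c(1, 5, 9), b = c(2, 4))

# lapply(demo, ...) passes each element itself to the function
by_element <- lapply(demo, function(x) x[x < 5])

# lapply(seq_along(demo), ...) passes the positions 1, 2, ...
by_index <- lapply(seq_along(demo), function(i) demo[[i]][demo[[i]] < 5])

names(by_element)  # names are carried over from demo
names(by_index)    # NULL: iterating over indices drops the names
```

Both calls select the same values; the only difference in the result is that the index-based version returns an unnamed list.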

Alternatively, redefine the function so that it takes each object directly instead of an index:

f <- function(obj){
  obj[as.Date(index(obj), format = "%Y-%m-%d") < "2018-11-01"]
}

new_lt <- lapply(lt, f)
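
Because every date in the posted sample falls before the cutoff, the effect of the filter is easier to see on a small made-up example. The sketch below assumes the zoo package is installed; the list demo, its series names, values, and dates are hypothetical, and the cutoff is built with as.Date so the comparison does not rely on implicit conversion of the character string:

```r
library(zoo)  # provides zoo() and index()

# Hypothetical toy series; the values and dates are made up for illustration
demo <- list(
  s1 = zoo(c(18, 8, 6), as.Date(c("2016-02-05", "2018-10-31", "2018-11-01"))),
  s2 = zoo(c(2, 0),     as.Date(c("2018-10-01", "2019-01-04")))
)

cutoff <- as.Date("2018-11-01")

# Keep only the observations strictly before the cutoff in each series
trimmed <- lapply(demo, function(obj) obj[index(obj) < cutoff])

sapply(trimmed, length)  # s1 keeps 2 observations, s2 keeps 1
```

The list names s1 and s2 are preserved because lapply iterates over the list itself.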

Finally, Filter() keeps or removes whole elements of a list according to a logical condition. It does not change the contents of each element, although you can use those contents to decide which top-level elements to drop. In contrast, for and lapply do not exclude elements during processing (i.e., the list has the same number of elements before and after the call).
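
The contrast can be sketched in base R with a hypothetical toy list (plain numeric vectors rather than the xts data):

```r
# Hypothetical toy list (plain numeric vectors, not the xts data)
demo <- list(a = c(1, 5, 9), b = c(7, 8), c = c(2, 3))

# lapply transforms every element; the result has as many elements as demo
trimmed_each <- lapply(demo, function(x) x[x < 5])

# Filter keeps or drops whole elements; the predicate must return a single
# TRUE or FALSE for each element
kept_whole <- Filter(function(x) any(x < 5), demo)

length(trimmed_each)  # 3: element b is still present, now as an empty vector
names(kept_whole)     # "a" "c": b was dropped, a and c are unchanged
```

This also suggests why the second attempt failed: function(x) f returns the function f itself rather than TRUE or FALSE, so Filter has nothing it can coerce to a logical value.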

Upvotes: 1
