I am translating for() loops into apply() family calls (sapply, lapply, mapply, etc.) to gain performance in my R code. I have a list of xts time series named lt that looks like this:
lt
$`11141550000`
y
2016-02-05 18
2016-03-03 8
2016-03-30 6
2016-04-26 0
$`11140780000`
y
2016-03-25 2
2016-03-30 0
2016-04-04 0
2016-04-09 0
2016-04-14 0
$`11141550000`
y
2016-02-05 18
2016-03-03 8
2016-07-16 10
2016-08-12 10
One chunk of my code is extremely slow (I know for loops are not efficient in R and should be avoided if you want to become more proficient in the language). I previously wrote the chunk like this:
for (i in 1:length(lt)){
lt[[i]] <- lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}
I am trying to translate this for loop into a fast Filter, sapply, or lapply operation that keeps, in each list element, only the observations dated before "2018-11-01". However, I have not been able to. First I tried:
f <- function(i){
lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}
lapply(lt, function(x) f(x))
But I received this error:
Error in lt[[i]] : recursive indexing failed at level 2
f <- function(i){
lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}
Filter(function(x) f, lt)
But I received this message:
Error in Filter(function(x) f, lt) :
(list) object cannot be coerced to type 'logical'
Filter(f, lt)
But again, I received an error:
Error in lt[[i]] : recursive indexing failed at level 2
I would appreciate any help in translating this for loop, as I need to better understand how the apply functions work.
Thanks to J.Gourlay's feedback, I am adding a sample of my list with dput:
> dput(lt)
structure(list(`11140780000` = structure(c(2, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 4, 0, 0, 0, 0, 0, 0, 1
), .Dim = c(132L, 1L), .Dimnames = list(NULL, "y"), index = structure(c(1458864000,
1459296000, 1459728000, 1460160000, 1460592000, 1461024000, 1461456000,
1461888000, 1462320000, 1462752000, 1463184000, 1463616000, 1464048000,
1464480000, 1464912000, 1465344000, 1465776000, 1466208000, 1466640000,
1467072000, 1467504000, 1467936000, 1468368000, 1468800000, 1469232000,
1469664000, 1470096000, 1470528000, 1470960000, 1471392000, 1471824000,
1472256000, 1472688000, 1473120000, 1473552000, 1473984000, 1474416000,
1474848000, 1475280000, 1475712000, 1476144000, 1476576000, 1477008000,
1477440000, 1477872000, 1478304000, 1478736000, 1479168000, 1479600000,
1480032000, 1480464000, 1480896000, 1481328000, 1481760000, 1482192000,
1482624000, 1483056000, 1483142400, 1483574400, 1484006400, 1484438400,
1484870400, 1485302400, 1485734400, 1486166400, 1486598400, 1487030400,
1487462400, 1487894400, 1488326400, 1488758400, 1489190400, 1489622400,
1490054400, 1490486400, 1490918400, 1491350400, 1491782400, 1492214400,
1492646400, 1493078400, 1493510400, 1493942400, 1494374400, 1494806400,
1495238400, 1495670400, 1496102400, 1496534400, 1496966400, 1497398400,
1497830400, 1498262400, 1498694400, 1499126400, 1499558400, 1499990400,
1500422400, 1500854400, 1501286400, 1501718400, 1502150400, 1502582400,
1503014400, 1503446400, 1503878400, 1504310400, 1504742400, 1505174400,
1505606400, 1506038400, 1506470400, 1506902400, 1507334400, 1507766400,
1508198400, 1508630400, 1509062400, 1509494400, 1509926400, 1510358400,
1510790400, 1511222400, 1511654400, 1512086400, 1512518400, 1512950400,
1513382400, 1513814400, 1514246400, 1514678400, 1514764800), tzone = "UTC", tclass = "Date"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", class = c("xts",
"zoo")), `11141550000` = structure(c(18, 8, 6, 0, 4, 8, 10, 10,
0, 23, 0, 8, 0, 2, 14, 16, 20, 4, 4, 4), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", class = c("xts",
"zoo"), index = structure(c(1454630400, 1456963200, 1459296000,
1461628800, 1463961600, 1466294400, 1468627200, 1470960000, 1473292800,
1475625600, 1477958400, 1480291200, 1482624000, 1483142400, 1483833600,
1486166400, 1488499200, 1490832000, 1493164800, 1495497600), tzone = "UTC", tclass = "Date"), .Dim = c(20L,
1L), .Dimnames = list(NULL, "y"))), .Names = c("11140780000",
"11141550000"))
As Jozef suggested, I confirm that I used the zoo library to run my code, i.e. the index function in the for loop comes from the zoo package.
Reputation: 107567
Essentially, your function f expects an integer index, but lapply(lt, ...) passes each element of lt itself. Consider passing seq_along(lt) as the input to lapply instead. Also remember that, unlike for loops, apply functions return objects, so assign the result of lapply to an object. Finally, when a function accepts a single non-optional argument, there is no need to wrap it in an anonymous function(x) call.
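f <- function(i){
  lt[[i]][as.Date(index(lt[[i]]), format = "%Y-%m-%d") < "2018-11-01"]
}
# lapply over integer indices returns an unnamed list, so restore the names of lt
new_lt <- setNames(lapply(seq_along(lt), f), names(lt))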
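If your for loop works as you say, new_lt should be exactly the same as lt after running the for loop, provided the element names are kept (lapply over seq_along(lt) on its own returns an unnamed list):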
all.equal(lt_after_for_loop, new_lt)
# [1] TRUE
identical(lt_after_for_loop, new_lt)
# [1] TRUE
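To make the point about names concrete, here is a minimal sketch in base R. It assumes a hypothetical toy list of plain numeric vectors (not the xts objects from the question) and a made-up threshold of 10 in place of the date cutoff.

```r
# Hypothetical toy data: a named list of numeric vectors, standing in for lt
lt <- list(a = c(1, 5, 10), b = c(2, 20))

# Reference result from a for loop that keeps values below 10
lt_for <- lt
for (i in seq_along(lt_for)) {
  lt_for[[i]] <- lt_for[[i]][lt_for[[i]] < 10]
}

# Index-based function, mirroring the structure of f above
f <- function(i) lt[[i]][lt[[i]] < 10]

by_index <- lapply(seq_along(lt), f)             # result has no names
by_index_named <- setNames(by_index, names(lt))  # names restored

is.null(names(by_index))            # TRUE
identical(lt_for, by_index)         # FALSE: the contents match, the names do not
identical(lt_for, by_index_named)   # TRUE
```

The for loop assigns into an existing named list, so the names survive. lapply builds a new list from its input, and an integer vector carries no names to copy.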
Alternatively, pass the objects themselves instead of indices, with the function adjusted accordingly:
f <- function(obj){
obj[as.Date(index(obj), format = "%Y-%m-%d") < "2018-11-01"]
}
new_lt <- lapply(lt, f)
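A small self-contained run of this object-passing approach might look like the sketch below. The series names s1 and s2, their values, and the helper name keep_before are all hypothetical; it assumes the zoo package is installed and uses plain zoo objects rather than xts.

```r
library(zoo)  # provides zoo() and index()

# Hypothetical toy series, not the data from the question
lt <- list(
  s1 = zoo(c(18, 8, 6), as.Date(c("2018-09-01", "2018-10-15", "2018-12-01"))),
  s2 = zoo(c(2, 0),     as.Date(c("2018-10-31", "2019-01-10")))
)

# Convert the cutoff once, so Date is compared with Date
cutoff <- as.Date("2018-11-01")

# Hypothetical helper: keep only observations dated before the cutoff
keep_before <- function(obj, cutoff) obj[index(obj) < cutoff]

# Extra arguments after the function are forwarded to it by lapply
new_lt <- lapply(lt, keep_before, cutoff = cutoff)

names(new_lt)           # names are preserved: "s1" "s2"
sapply(new_lt, length)  # s1 keeps 2 observations, s2 keeps 1
```

Because lapply iterates over lt itself, the element names carry over without any extra step, and the function no longer depends on a global lt.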
Finally, Filter() keeps or removes whole elements of a list according to a logical condition: its function must return a single TRUE or FALSE per element. It does not subset the contents inside each element, although you can use those contents to decide whether the top-level element stays in the list. By contrast, for and lapply do not exclude elements during processing (i.e., the list has the same number of elements before and after).
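The difference can be seen in a short base R sketch. The toy list and the threshold of 10 are hypothetical stand-ins for the time series and the date cutoff.

```r
# Hypothetical toy data
lt <- list(a = c(1, 5, 10), b = c(20, 30), c = c(2, 3))

# lapply transforms every element; the list keeps all three elements
trimmed <- lapply(lt, function(x) x[x < 10])
length(trimmed)   # 3 (element b is now an empty numeric vector)

# Filter needs one logical value per element and drops whole elements
kept <- Filter(function(x) all(x < 10), lt)
names(kept)       # "c"

# The two can be combined: trim each element, then drop the empty ones
nonempty <- Filter(length, trimmed)
names(nonempty)   # "a" "c"
```

Here Filter(length, trimmed) works because a length of 0 is coerced to FALSE and any positive length to TRUE.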