Reputation: 6669
Suppose we have data like this:
#some data
x = data.frame(
  letters = factor(c("a", "c"), levels = letters[1:4])
)
That is, the factor has levels b and d that do not appear in the data. We can loop over the groups of letters:
#loop over the groups
plyr::ddply(x, "letters", function(xx) {
  #do something here
  if (xx$letters == "b") print("do something")
  data.frame(count = nrow(xx))
})
This gives us:
letters count
1 a 1
2 c 1
So we are missing the b and d levels. We then add .drop = F to keep them:
plyr::ddply(x, "letters", .drop = F, function(xx) {
  #do something here
  #the check below is commented out because it fails on an empty group
  #if (xx$letters == "b") print("do something")
  data.frame(count = nrow(xx))
})
and we get:
letters count
1 a 1
2 b 0
3 c 1
4 d 0
However, suppose we want to do something inside the loop based on the letter group, for example when we reach the empty b group. The problem is that we cannot tell when we are inside it. If we add if (nrow(xx)==0) browser(), we can look at xx:
[1] letters
<0 rows> (or 0-length row.names)
But we can't tell whether it is b or d. Is it possible to find out?
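To make the ambiguity concrete, here is a minimal base R sketch (no plyr involved; empty_b and empty_d are names made up for illustration, standing in for the empty pieces). A zero-row subset still carries all four factor levels, so nothing in it says which level it was meant to represent.

```r
# Same data as above
x <- data.frame(
  letters = factor(c("a", "c"), levels = letters[1:4])
)

# Hypothetical stand-ins for the empty pieces of levels b and d
empty_b <- x[x$letters == "b", , drop = FALSE]
empty_d <- x[x$letters == "d", , drop = FALSE]

nrow(empty_b)                # no rows to inspect
levels(empty_b$letters)      # all four levels are still attached
identical(empty_b, empty_d)  # the two empty pieces are indistinguishable
```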
Reputation: 6669
Yes, it can be done with a fancy lookup. To figure it out, call browser() inside the loop and inspect the environment for objects with ls():
Called from: .fun(piece, ...)
Browse[1]> c
Called from: .fun(piece, ...)
Browse[1]> xx
[1] letters
<0 rows> (or 0-length row.names)
Browse[1]> ls(all.names = T)
[1] "xx"
So there is nothing here except the empty data frame piece (a subset of the original data). It would have been nice to find a hidden object here identifying the piece, but there is none. However, we can look at the parent frames and see if we get lucky:
Browse[1]> ls(all.names = T, envir = parent.frame(1))
[1] "i" "piece"
Browse[1]> ls(all.names = T, envir = parent.frame(2))
[1] "..." ".data" ".fun" ".inform" ".parallel" ".paropts" ".progress" "do.ply" "n" "pieces" "progress"
[12] "result"
OK, there is definitely something in them. One can fetch these using get(), or mget() for multiple at a time:
Browse[1]> mget(ls(envir = parent.frame(1)), envir = parent.frame(1))
[1] 2
[1] letters
<0 rows> (or 0-length row.names)
Browse[1]> mget(ls(envir = parent.frame(2)), envir = parent.frame(2))
function (i)
piece <- pieces[[i]]
if (.inform) {
res <- try(.fun(piece, ...))
if (inherits(res, "try-error")) {
piece <- paste(utils::capture.output(print(piece)),
collapse = "\n")
stop("with piece ", i, ": \n", piece, call. = FALSE)
else {
res <- .fun(piece, ...)
<bytecode: 0x559669467ca8>
<environment: 0x55966c7c6798>
[1] 4
1 a
[1] letters
<0 rows> (or 0-length row.names)
1 c
[1] letters
<0 rows> (or 0-length row.names)
function (x)
<bytecode: 0x559669453cd0>
<environment: 0x55966e5c8b50>
function ()
<bytecode: 0x559669453e58>
<environment: 0x55966e5c8b50>
function ()
<bytecode: 0x559669453e58>
<environment: 0x55966e5c8b50>
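The same kind of lookup can be tried in isolation, without plyr. The sketch below is only an illustration: run_pieces and which_piece are hypothetical names, and run_pieces is a simplified stand-in for a loop that keeps an index and a named list as local variables. Unlike in plyr, both objects sit in the same frame here.

```r
# Hypothetical stand-in for an internal loop: it holds a named list
# `pieces` and an index `i`, and calls `f` once per piece.
run_pieces <- function(pieces, f) {
  out <- character(length(pieces))
  for (i in seq_along(pieces)) {
    piece <- pieces[[i]]
    out[i] <- f(piece)
  }
  out
}

# The callback only receives the piece, but it can read the caller's locals
which_piece <- function(piece) {
  caller <- parent.frame(1)  # the frame of run_pieces()
  i <- get("i", envir = caller)
  names(get("pieces", envir = caller))[i]
}

seen <- run_pieces(list(a = 1, b = NULL, c = 3, d = NULL), which_piece)
seen
```

Even for the NULL pieces, the callback recovers the name, because the name comes from the caller's frame rather than from the piece itself.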
From this output, i in parent.frame(1) is the index of the current piece, and the names of pieces in parent.frame(2) hold the levels we want. Putting them together, we can get the current level:
plyr::ddply(x, "letters", .drop = F, function(xx) {
  #figure out the piece
  i = get("i", envir = parent.frame(1))
  #pieces lives one frame further up, but get() searches enclosing
  #environments by default, so it is also found starting from here
  levels = names(get("pieces", envir = parent.frame(1)))
  current_piece = levels[i]
  #do something
  if (current_piece == "b") print("this is the b empty group!") else print("This is not level b")
  data.frame(count = nrow(xx))
})
which results in:
[1] "This is not level b"
[1] "this is the b empty group!"
[1] "This is not level b"
[1] "This is not level b"
letters count
1 a 1
2 b 0
3 c 1
4 d 0
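Note that i and pieces are internal to plyr rather than part of its documented interface, so this lookup may break in other versions. If that is a concern, one alternative (a different technique from the one above, sketched here with base R only) is to do the split yourself so the level name is passed in explicitly. count_piece is a hypothetical helper name.

```r
x <- data.frame(
  letters = factor(c("a", "c"), levels = letters[1:4])
)

# split() keeps unused levels by default (drop = FALSE) and names each piece
pieces <- split(x, x$letters)

# Hypothetical helper that receives both the piece and its level name
count_piece <- function(xx, level) {
  if (level == "b") message("this is the b empty group!")
  data.frame(letters = level, count = nrow(xx))
}

# Apply the helper to each piece with its name, then stack the results
result <- do.call(rbind, Map(count_piece, pieces, names(pieces)))
rownames(result) <- NULL
result
```

This gives up the convenience of ddply, but the group label no longer depends on what happens to be in the calling frames.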