Suppose we have data like this:
library(plyr)
#some data
x = data.frame(
  letters = factor(c("a", "c"), levels = letters[1:4])
)
I.e., the factor has levels b and d that don't appear in the data. We can loop over the groups of letters:
#loop inside
plyr::ddply(x, "letters", function(xx) {
  #do something here
  if (xx$letters == "b") print("do something")
  data.frame(
    count = nrow(xx)
  )
})
gives us:
letters count
1 a 1
2 c 1
So we are missing the b and d levels. We then add .drop = F so they are not skipped:
plyr::ddply(x, "letters", .drop = F, function(xx) {
  #do something here
  #if (xx$letters == "b") print("do something")
  data.frame(
    count = nrow(xx)
  )
})
we get:
letters count
1 a 1
2 b 0
3 c 1
4 d 0
However, suppose we want to do something inside the loop based on the letter group, for example when we reach the empty b group. The problem is that we don't know when we are inside it. If we add if (nrow(xx)==0) browser(), we can look at the xx object:
[1] letters
<0 rows> (or 0-length row.names)
But we can't tell whether it is b or d. Is it possible to find out?
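To make the problem concrete, here is a minimal base R sketch of what such an empty piece contains. The name empty_piece is hypothetical, and building it by subsetting is an assumption about how the piece looks, based on the printed output above rather than on plyr's code:

```r
# Rebuild the question's data: a factor with two unused levels.
x <- data.frame(
  letters = factor(c("a", "c"), levels = letters[1:4])
)

# Emulate the empty piece for level "b" (hypothetical stand-in for xx).
empty_piece <- x[x$letters == "b", , drop = FALSE]

nrow(empty_piece)                  # zero rows
levels(empty_piece$letters)        # all four levels are still attached
as.character(empty_piece$letters)  # no values left to identify the group

# Comparing a zero-length vector gives a zero-length logical,
# which is why `if (xx$letters == "b")` cannot be used on empty groups.
length(empty_piece$letters == "b")
```

The factor column keeps every level but holds no values, so nothing in the piece itself distinguishes b from d.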
Yes, it can be done with a somewhat fancy lookup in the calling environments. To figure it out, call browser() inside the loop and inspect the environment for objects with ls():
Called from: .fun(piece, ...)
Browse[1]> c
Called from: .fun(piece, ...)
Browse[1]> xx
[1] letters
<0 rows> (or 0-length row.names)
Browse[1]> ls(all.names = T)
[1] "xx"
So there is nothing here except the empty data frame piece (a subset of the original data). It would have been nice if there were a hidden object here to identify the piece, but alas. However, we can look at the parent environments and see if we get lucky:
Browse[1]> ls(all.names = T, envir = parent.frame(1))
[1] "i" "piece"
Browse[1]> ls(all.names = T, envir = parent.frame(2))
[1] "..." ".data" ".fun" ".inform" ".parallel" ".paropts" ".progress" "do.ply" "n" "pieces" "progress"
[12] "result"
OK, there is definitely something in them. One can fetch these using get(), or mget() for multiple at a time:
Browse[1]> mget(ls(envir = parent.frame(1)), envir = parent.frame(1))
$i
[1] 2
$piece
[1] letters
<0 rows> (or 0-length row.names)
Browse[1]> mget(ls(envir = parent.frame(2)), envir = parent.frame(2))
$do.ply
function (i)
{
piece <- pieces[[i]]
if (.inform) {
res <- try(.fun(piece, ...))
if (inherits(res, "try-error")) {
piece <- paste(utils::capture.output(print(piece)),
collapse = "\n")
stop("with piece ", i, ": \n", piece, call. = FALSE)
}
}
else {
res <- .fun(piece, ...)
}
progress$step()
res
}
<bytecode: 0x559669467ca8>
<environment: 0x55966c7c6798>
$n
[1] 4
$pieces
$a
letters
1 a
$b
[1] letters
<0 rows> (or 0-length row.names)
$c
letters
1 c
$d
[1] letters
<0 rows> (or 0-length row.names)
$progress
$progress$init
function (x)
NULL
<bytecode: 0x559669453cd0>
<environment: 0x55966e5c8b50>
$progress$step
function ()
NULL
<bytecode: 0x559669453e58>
<environment: 0x55966e5c8b50>
$progress$term
function ()
NULL
<bytecode: 0x559669453e58>
<environment: 0x55966e5c8b50>
$result
$result[[1]]
NULL
$result[[2]]
NULL
$result[[3]]
NULL
$result[[4]]
NULL
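The same kind of parent-frame lookup can be tried without plyr. The sketch below uses hypothetical functions outer_fn and inner_fn as stand-ins for the two frames seen above (the one holding pieces and the one holding i); it only illustrates the mechanism and is not plyr's actual code:

```r
# outer_fn plays the role of the frame that holds `pieces`;
# inner_fn plays the role of the frame that holds `i`.
outer_fn <- function(callback) {
  pieces <- list(a = 1, b = 2)
  inner_fn <- function(i) callback()
  result <- vector("list", length(pieces))
  for (k in seq_along(pieces)) result[[k]] <- inner_fn(k)
  result
}

callback <- function() {
  # One frame up: the caller (inner_fn), which holds the index.
  i <- get("i", envir = parent.frame(1), inherits = FALSE)
  # Two frames up: the caller's caller (outer_fn), which holds the named list.
  piece_names <- names(get("pieces", envir = parent.frame(2), inherits = FALSE))
  piece_names[i]
}

found <- outer_fn(callback)
```

Setting inherits = FALSE makes get() look only in the given frame, which helps confirm which frame really owns each object.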
So we see that i in parent.frame(1) is the index of the current piece, and the names of pieces in parent.frame(2) are the levels we want. Putting them together, we can get the current level:
plyr::ddply(x, "letters", .drop = F, function(xx) {
  #figure out the piece
  i = get("i", envir = parent.frame(1))
  #pieces lives in parent.frame(2); parent.frame(1) only finds it via inheritance
  levels = names(get("pieces", envir = parent.frame(2)))
  current_piece = levels[i]
  #do something
  if (current_piece == "b") print("this is the b empty group!") else print("This is not level b")
  data.frame(
    count = nrow(xx)
  )
})
which results in:
[1] "This is not level b"
[1] "this is the b empty group!"
[1] "This is not level b"
[1] "This is not level b"
letters count
1 a 1
2 b 0
3 c 1
4 d 0
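Since the lookup above depends on object names inside plyr (i and pieces) that are not part of its documented interface, it may break in other versions. As a possible alternative, here is a base R sketch that iterates over the level names directly, so the current level is always known. The names pieces_by_level, level_name and counts are made up for this example:

```r
x <- data.frame(
  letters = factor(c("a", "c"), levels = letters[1:4])
)

# split() keeps unused factor levels by default (drop = FALSE)
# and names each piece after its level.
pieces_by_level <- split(x, x$letters)

counts <- do.call(rbind, lapply(names(pieces_by_level), function(level_name) {
  xx <- pieces_by_level[[level_name]]
  # The level is available by name, even when xx has zero rows.
  if (level_name == "b") message("this is the b empty group!")
  data.frame(letters = level_name, count = nrow(xx), stringsAsFactors = FALSE)
}))
```

This trades the ddply convenience for an explicit loop, but it does not reach into another function's frames.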