Reputation: 79
I have data in a nested list structure in R, and I'd like to use a lookup table to change names no matter where they occur in the structure. Example:
# build up an example
x <- as.list(c("a" = NA))
x[[1]] <- vector("list", 4)
names(x[[1]]) <- c("b","c","d","e")
x$a$b <- vector("list", 2)
names(x$a$b) <- c("d","f")
x$a$c <- 3
x$a$d <- 27
x$a$e <- "d"
x$a$b$d <- "data"
x$a$b$f <- "more data"
# make a lookup table for names I want to change from; to
lkp <- data.frame(matrix(data = c("a","z","b","bee","d","dee"),
ncol = 2,
byrow = TRUE), stringsAsFactors = FALSE)
names(lkp) <- c("from","to")
Output from the above
> x
$a
$a$b
$a$b$d
[1] "data"
$a$b$f
[1] "more data"
$a$c
[1] 3
$a$d
[1] 27
$a$e
[1] "d"
> lkp
from to
1 a z
2 b bee
3 d dee
Here is what I came up with to do this for only the first level:
> for(i in 1:nrow(lkp)){
+ names(x)[names(x) == lkp$from[[i]]] <- lkp$to[[i]]
+ }
> x
$z
$z$b
$z$b$d
[1] "data"
$z$b$f
[1] "more data"
$z$c
[1] 3
$z$d
[1] 27
$z$e
[1] "d"
So that works fine but uses a loop and only gets at the first level. I've tried various versions of the *apply world but have not yet been able to get something useful.
Thanks in advance for any thoughts
EDIT: Interestingly, rapply fails miserably (or I fail miserably in my attempt!) when trying to access and modify names. Here's an example of just trying to change all names to the same value:
> namef <- function(x) names(x) <- "z"
> rapply(x, namef, how = "list")
$a
$a$b
$a$b$d
[1] "z"
$a$b$f
[1] "z"
$a$c
[1] "z"
$a$d
[1] "z"
$a$e
[1] "z"
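For what it's worth, the output above is consistent with how rapply works: it calls the function on the non-list leaves only, and a function whose last expression is an assignment returns the assigned value, here "z". The following is a minimal sketch of that behaviour; namef_fixed is a hypothetical helper added for illustration. Even when the function returns x, only the leaf vector gets a name, and the list element names stay untouched.

```r
# The function from the example: its value is the right-hand side of the assignment
namef <- function(x) names(x) <- "z"
leaf_value <- namef(3)   # the leaf is replaced by the string "z"

# Hypothetical variant that returns the modified leaf instead
namef_fixed <- function(x) {
  names(x) <- "z"
  x
}

# rapply hands only the leaf (the number 1) to the function,
# so the list element is still called "a" afterwards
res <- rapply(list(a = 1), namef_fixed, how = "list")
names(res)     # name of the list element
names(res$a)   # name attached to the leaf vector itself
```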
Upvotes: 2
Views: 920
Reputation: 6234
Using an external package, you can also do this with rrapply() in the rrapply package (an extension of base rapply()):
library(rrapply) ## v1.2.1
rrapply(list(x),
classes = "list",
f = function(x) {
newnames <- lkp$to[match(names(x), lkp$from)]
names(x)[!is.na(newnames)] <- newnames[!is.na(newnames)]
return(x)
},
how = "recurse"
)[[1]]
#> $z
#> $z$bee
#> $z$bee$dee
#> [1] "data"
#>
#> $z$bee$f
#> [1] "more data"
#>
#>
#> $z$c
#> [1] 3
#>
#> $z$dee
#> [1] 27
#>
#> $z$e
#> [1] "d"
Here, the f function achieves essentially the same as OP's for-loop. how = "recurse" tells the function to continue recursion after the application of f.
Note that the input is wrapped as list(x) so that the f function also modifies the name(s) of the list itself.
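For comparison, the same rename-then-recurse idea can be sketched in base R without any package. This is only a minimal sketch under stated assumptions: the helper name rename_nested is hypothetical (it is not part of rrapply or base R), and x and lkp are rebuilt compactly so the block runs on its own.

```r
# Compact rebuild of the example data from the question
x <- list(a = list(b = list(d = "data", f = "more data"),
                   c = 3, d = 27, e = "d"))
lkp <- data.frame(from = c("a", "b", "d"),
                  to   = c("z", "bee", "dee"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: rename the current level, then recurse into sub-lists
rename_nested <- function(lst, lkp) {
  if (!is.list(lst)) return(lst)        # leaves are returned unchanged
  idx <- match(names(lst), lkp$from)    # NA where a name has no lookup entry
  hit <- !is.na(idx)
  if (any(hit)) names(lst)[hit] <- lkp$to[idx[hit]]
  lapply(lst, rename_nested, lkp = lkp) # lapply keeps the (new) names
}

res <- rename_nested(x, lkp)
str(res)
```

Because this helper renames the list it receives before descending, the top-level name is handled without the list(x) wrapper. Values such as "d" under e are left alone, since only names are matched.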
Update
rrapply v1.2.5 contains a dedicated option how = "names" to replace names in a nested list, which is a bit less convoluted:
rrapply(
x,
f = function(x, .xname) {
newname <- lkp$to[match(.xname, lkp$from)]
return(ifelse(is.na(newname), .xname, newname))
},
how = "names"
)
#> $z
#> $z$bee
#> $z$bee$dee
#> [1] "data"
#>
#> $z$bee$f
#> [1] "more data"
#>
#>
#> $z$c
#> [1] 3
#>
#> $z$dee
#> [1] 27
#>
#> $z$e
#> [1] "d"
Upvotes: 1
Reputation: 1434
I used a named character vector for the lookup instead of your data.frame, but it will be easy to change if you really want a data.frame.
library(purrr)  # provides map() and the %>% pipe used below
lkp2 <- lkp$to
names(lkp2) <- lkp$from
rename <- function(nested_list) {
found <- names(nested_list) %in% names(lkp2)
names(nested_list)[found] <- lkp2[names(nested_list)[found]]
nested_list %>% map(~{
if (is.list(.x)) {
rename(.x)
} else {
.x
}
})
}
rename(x)
# $z
# $z$bee
# $z$bee$dee
# [1] "data"
#
# $z$bee$f
# [1] "more data"
#
#
# $z$c
# [1] 3
#
# $z$dee
# [1] 27
#
# $z$e
# [1] "d"
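The lookup step inside the function can be tried in isolation. A minimal sketch, assuming the same from/to pairs as in the question; the variable nms is introduced here purely for illustration and stands in for names(nested_list) at one level.

```r
# Named character vector: names are the old labels, values are the new ones
lkp2 <- c(a = "z", b = "bee", d = "dee")

# Names found at one level of the nested list
nms <- c("b", "c", "d", "e")

# Only names present in the lookup are replaced; the rest stay as they are
found <- nms %in% names(lkp2)
nms[found] <- unname(lkp2[nms[found]])
nms
```

Indexing the named vector by the old names is what makes the translation a single vectorised step, with no loop over the lookup rows.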
I am not sure this is the best way to do it, but it seems to do the job, and if you're only working with small lists (like XML documents) then there is no need to worry much about performance.
You might want to give the function a better name.
Upvotes: 2