Reputation: 13
It has been stated several times that dplyr will drop rownames, and now version 0.3 has done so.
I have often used row names to translate between different identifiers kept in a data frame like this:
test <- data.frame(Greek = c("Alpha", "Beta", "Gamma"), Letters = LETTERS[1:3])
rownames(test) <- test$Letters
lookup <- c("C", "B")
test[lookup, "Greek"]
[1] Gamma Beta
Levels: Alpha Beta Gamma
Because the dplyr table no longer keeps row names, the same lookup now fails:
library(dplyr)
test <- tbl_df(data.frame(Greek = c("Alpha", "Beta", "Gamma"), Letters = LETTERS[1:3]))
rownames(test) <- test$Letters
lookup <- c("C", "B")
test[lookup, "Greek"]
Source: local data frame [2 x 1]
Greek
1 NA
2 NA
I've tried using filter() and select(), but couldn't find a solution which preserves the order of the lookup.
Upvotes: 1
Views: 2463
Reputation: 66874
This is one time where you can play with matches, i.e. use match():
test[match(lookup,test$Letters),"Greek"]
[1] Gamma Beta
Levels: Alpha Beta Gamma
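To make the ordering behaviour explicit, here is a minimal base R sketch of the same idea. It assumes a plain data.frame with character columns (stringsAsFactors = FALSE is my choice for the illustration, not part of the original example), and the names idx, result and with_missing are just illustrative:

```r
# Same toy data as in the question, but with character columns
test <- data.frame(Greek = c("Alpha", "Beta", "Gamma"),
                   Letters = LETTERS[1:3],
                   stringsAsFactors = FALSE)
lookup <- c("C", "B")

# match() returns, for each element of lookup, its position in test$Letters,
# so the positions come back in lookup order
idx <- match(lookup, test$Letters)

# Index the Greek column with those positions
result <- test$Greek[idx]

# A key that is not present yields NA instead of an error
with_missing <- test$Greek[match(c("C", "Z"), test$Letters)]
```

Because match() walks through lookup rather than through the table, the result follows the order of lookup, which is what the row-name indexing used to give.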
And you can wrap it in do() to make it dplyr-ic:
test %>% do(`[`(.,match(lookup,.$Letters),)) %>% select(Greek)
Source: local data frame [2 x 1]
Greek
1 Gamma
2 Beta
Or, as @hadley mentions, left_join() does what you are looking for:
left_join(data.frame(Letters=lookup),test) %>% select(Greek)
Joining by: "Letters"
Greek
1 Gamma
2 Beta
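For completeness, a hedged sketch of the join approach with the key spelled out. It assumes character columns on both sides (stringsAsFactors = FALSE is my addition) and passes by = "Letters" explicitly so the "Joining by" message is not needed; the names keys and result are just illustrative:

```r
library(dplyr)

test <- data.frame(Greek = c("Alpha", "Beta", "Gamma"),
                   Letters = LETTERS[1:3],
                   stringsAsFactors = FALSE)
lookup <- c("C", "B")

# Put the lookup keys in a one-column data frame; left_join() keeps the
# rows of its first argument in their original order
keys <- data.frame(Letters = lookup, stringsAsFactors = FALSE)

result <- left_join(keys, test, by = "Letters") %>%
  select(Greek)
```

Since the lookup table is the left-hand side, the result should follow the order of lookup, and any key without a match would come back with NA in Greek.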
Upvotes: 2