Reputation: 13792
I have this data.frame:
df <- data.frame(id = rep(c("one", "two", "three"), each = 10), week.born = NA)
df$week.born[c(5,15,28)] <- c(23,19,24)
df
id week.born
1 one NA
2 one NA
3 one NA
4 one NA
5 one 23
6 one NA
7 one NA
8 one NA
9 one NA
10 one NA
11 two NA
12 two NA
13 two NA
14 two NA
15 two 19
16 two NA
17 two NA
18 two NA
19 two NA
20 two NA
21 three NA
22 three NA
23 three NA
24 three NA
25 three NA
26 three NA
27 three NA
28 three 24
29 three NA
30 three NA
For one
all week.born
values should be 23
.
For two
all week.born
values should be 19
.
For one
all week.born
values should be 24
.
Whats the best way to do this?
Upvotes: 3
Views: 2742
Reputation: 23004
When mapping just a few combinations like this, the mapvalues
function from the plyr
package is simple:
library(plyr)
df$week.born <- mapvalues(df$id, c("one", "two", "three"), c(23, 19, 24))
Upvotes: 0
Reputation: 1035
I would create another data.frame containing the mapping and then do a simple join:
require(dplyr)
map <- data.frame(id=c("one","two","three"), new.week.born=c(23,19,24))
left_join(df, map, by="id")
# id week.born new.week.born
# 1 one NA 23
# 2 one NA 23
# ...
# 16 two NA 19
# 17 two NA 19
# 18 two NA 19
# 19 two NA 19
# 20 two NA 19
# 21 three NA 24
# 22 three NA 24
# 23 three NA 24
# ...
See benchmark below.
library(microbenchmark)
library(dplyr) # v 0.4.1
library(data.table) # v 1.9.5
df <- data.frame(id = rep(c("one", "two", "three"), each = 1e6))
df2 <- copy(df)
map <- data.frame(id=c("one","two","three"), new.week.born=c(23,19,24))
dplyr_join <- function() {
left_join(df, map, by="id")
}
r_merge <- function() {
merge(df, map, by="id")
}
data.table_join <- function() {
setkey(setDT(df2))[map]
}
Unit: milliseconds
expr min lq mean median uq max neval
dplyr_join() 409.10635 476.6690 910.6446 489.4573 705.4021 2866.151 10
r_merge() 41589.32357 47376.0741 55719.1752 50133.0918 54636.3356 83562.931 10
data.table_join() 94.14621 132.3788 483.4220 225.3309 1051.7916 1416.946 10
Upvotes: 6
Reputation: 16697
@cho7tom is OK if you have just few groups, otherwise you may prefer to have a lookup table and make a join to that table to lookup week.born
value based on id
.
base R
df <- data.frame(id = rep(c("one", "two", "three"), each = 10))
lkp <- data.frame(id=c("one","two","three"), week.born=c(23,19,24))
merge(df, lkp, by="id")
Or using binary join from data.table
library(data.table)
setkey(setDT(df))[lkp]
Upvotes: 2
Reputation: 31161
You can do:
library(data.table)
setDT(df)[,week.born:=week.born[!is.na(week.born)][1], by=id]
Or base R
using ave
:
df$week.born = with(df, ave(week.born, id, FUN=function(u) u[!is.na(u)][1]))
Upvotes: 3
Reputation: 1070
One solution is:
df$week.born[df$id == "one"] <- 23
df$week.born[df$id == "two"] <- 19
df$week.born[df$id == "three"] <- 24
Regards
Upvotes: 2