Reputation: 2166
I love data.table and use it to conditionally rename or add factor levels. However, I can't seem to match more than one level at a time. Here is an example:
library(data.table)
a <- rep(c("A", "B", "C", "D"), each=3)
b <- 1:12
df <- data.frame(a,b)
DT <- data.table(df)
Now add a new column "New" that equals "z" for every row where column "a" is "A":
DT[a=="A", New:="z"]
This works nicely. Now suppose I want "New" to equal "z" for both "A" and "C":
DT[a==c("A", "C"), New:="z"]
This gives me an unexpected result:
dput(DT)
structure(list(a = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L, 4L, 4L, 4L), .Label = c("A", "B", "C", "D"), class = "factor"),
b = 1:12, New = c("z", NA, "z", NA, NA, NA, NA, "z", NA,
NA, NA, NA)), .Names = c("a", "b", "New"), row.names = c(NA,
-12L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000000140788>, index = structure(integer(0), a = integer(0)))
I'm sure it's something simple, but I can't seem to find it on SO (cue the dupe!). Thanks.
To confirm, my desired output is:
dput(DT)
structure(list(a = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L, 4L, 4L, 4L), .Label = c("A", "B", "C", "D"), class = "factor"),
b = 1:12, New = c("z", "z", "z", NA, NA, NA, "z", "z", "z",
NA, NA, NA)), .Names = c("a", "b", "New"), row.names = c(NA,
-12L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000000140788>, index = structure(integer(0), a = integer(0)))
Upvotes: 1
Views: 95
Reputation: 83215
You should use %in% instead of ==, so you need:
DT[a %in% c("A", "C"), New:="z"]
which gives:
> DT
    a  b New
 1: A  1   z
 2: A  2   z
 3: A  3   z
 4: B  4  NA
 5: B  5  NA
 6: B  6  NA
 7: C  7   z
 8: C  8   z
 9: C  9   z
10: D 10  NA
11: D 11  NA
12: D 12  NA
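As for why == gave the odd result in the question: as far as I can tell it comes down to vector recycling. The two-element vector is repeated along the column and compared position by position, so only rows that happen to line up are matched. A minimal base R sketch, using a plain character vector in place of the column (the names eq_match and in_match are just for illustration):

```r
# Same values as column "a" in the question, as a plain character vector
a <- rep(c("A", "B", "C", "D"), each = 3)

# `==` recycles c("A", "C") to length 12 and compares position by position:
# a[1] vs "A", a[2] vs "C", a[3] vs "A", a[4] vs "C", ...
eq_match <- a == c("A", "C")
which(eq_match)   # 1 3 8, the same rows that got "z" in the question

# `%in%` asks, for each element of a, whether it occurs anywhere in the set
in_match <- a %in% c("A", "C")
which(in_match)   # 1 2 3 7 8 9

# The same test spelled out with `|`
identical(in_match, a == "A" | a == "C")   # TRUE
```

So %in% is the set-membership test you want here, while == is only safe against a single value (or a vector of the same length).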
Used data:
a <- rep(c("A", "B", "C", "D"), each=3)
b <- 1:12
DT <- data.table(a,b)
With a plain data.frame in base R you could do:
df <- data.frame(a,b)
df$New <- NA
df[df$a %in% c("A", "C"), "New"] <- "z"
to achieve the same result.
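If you prefer to build the column in one step instead of initialising it with NA first, ifelse() is one possible alternative. This is a different technique from the subset assignment above, shown only as a sketch on the same example data:

```r
a <- rep(c("A", "B", "C", "D"), each = 3)
b <- 1:12
df <- data.frame(a, b)

# ifelse() fills the whole column at once:
# "z" where a is "A" or "C", a character NA everywhere else
df$New <- ifelse(df$a %in% c("A", "C"), "z", NA_character_)
df
```

Using NA_character_ rather than a bare NA just makes the intended column type explicit.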
Upvotes: 5