Reputation: 20130
I have two data.tables, dt
is a long one with an integer column levels
in the range 1...5, and another data.table "labels" containing labels in a simple form like this:
labels <- data.table(V1=1:5, V2=c("Very Low", "Low", "Median", "High", "Very High"))
# V1 V2
# 1: 1 Very Low
# 2: 2 Low
# 3: 3 Median
# 4: 4 High
# 5: 5 Very High
The actual dt
is rather large, but for reproducibility a simple one will do (though in real DT levels are not that regular):
dt <- data.table(levels=rep(1:5, times=10))
How I could replace levels column in dt
with character labels from labels
in one go?
I could do this in manual loop (ugly!), or I could do this by adding another column, like this:
dt[, tmp := labels$V2[dt$level] ]
and then dropping column level
and renaming tmp
.
Is there a good data.table way to do so?
Upvotes: 2
Views: 1501
Reputation: 83275
The easiest approach is joining the data.tables. In order to show the effect I added an id
column to dt
(see below). You can join the data.tables as follows:
dt[labels, on=c("levels"="V1")][order(id)] # the [order(id)] part is not necessary, but added to show the effect better
which gives (first 7 rows):
levels id V2
1: 1 1 Very Low
2: 2 2 Low
3: 3 3 Median
4: 4 4 High
5: 5 5 Very High
6: 1 6 Very Low
7: 2 7 Low
....
Or probably even better:
dt <- dt[labels, .(id,levels=V2), on=c("levels"="V1")][order(id)]
which gives (first 7 rows):
> dt
id levels
1: 1 Very Low
2: 2 Low
3: 3 Median
4: 4 High
5: 5 Very High
6: 6 Very Low
7: 7 Low
....
Another option is to use the match
function with the labels
data.table as a lookup table:
dt[, levels := labels$V2[match(levels, labels$V1)]]
which gives:
> dt
levels id
1: Very Low 1
2: Low 2
3: Median 3
4: High 4
5: Very High 5
6: Very Low 6
7: Low 7
....
Used data:
dt <- data.table(levels=rep(1:5, times=10))[,id:=.I]
labels <- data.table(V1=1:5, V2=c("Very Low", "Low", "Median", "High", "Very High"))
Upvotes: 5
Reputation: 1159
Suppose that your datasets are generated like this:
dt <- data.table(levels=rep(1:5, times=10))
labels <- data.table(V1=1:5, V2=c("Very Low", "Low", "Median", "High", "Very High"))
Then you can "relabel" the levels of dt
using the factor
function:
dt[, level := as.character(factor(level, labels = labels$V2))]
If you don't mind level being of type factor
, you can skip the as.character
and just do:
dt[, level := factor(level, labels = labels$V2)]
Upvotes: 3