Henk

Reputation: 3656

Order factor levels in order of appearance in data set

I have a survey in which each question must be assigned a unique ID. Some questions appear multiple times, which means there is an extra layer of questions. The sample data below includes only the first layer.

Question: how do I assign a unique index in order of appearance? The solution provided here assigns indices alphabetically, because as.factor() sorts the levels. I could order the factor levels by hand, but that defeats the purpose of doing it in R (there are many questions to sort).

library(data.table)
dt = data.table(question = c("C", "C", "A", "B", "B", "D"), 
                value = c(10,20,30,40,20,30))

dt[, idx := as.numeric(as.factor(question))]

gives:

#    question value idx
# 1:        C    10   3
# 2:        C    20   3
# 3:        A    30   1
# 4:        B    40   2
# 5:        B    20   2
# 6:        D    30   4

# but required is:
dt[, idx.required := c(1, 1, 2, 3, 3, 4)]

Upvotes: 9

Views: 2209

Answers (2)

David Arenburg

Reputation: 92292

I think the data.table way to do this would be to use .GRP, the group counter. Grouping with by creates the groups in order of first appearance:

dt[, idx := .GRP, by = question]

##    question value idx
## 1:        C    10   1
## 2:        C    20   1
## 3:        A    30   2
## 4:        B    40   3
## 5:        B    20   3
## 6:        D    30   4
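
Since the question mentions that some questions appear more than once, here is a minimal sketch of how .GRP behaves when a value comes back after other values. The table dt2 and its contents are hypothetical and not part of the original sample; the point is only that by groups on the value itself, so a repeated question should keep the index it received on first appearance.

```r
library(data.table)

# Hypothetical data: "C" reappears after "A" (not in the original sample)
dt2 <- data.table(question = c("C", "A", "C", "B"))

# by = groups on the value of question, numbered in order of first appearance
dt2[, idx := .GRP, by = question]

print(dt2)
```

Both "C" rows end up in the same group, so they share one index even though they are not adjacent.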

Upvotes: 8

lukeA

Reputation: 54237

You could specify the factor levels explicitly, in order of appearance:

dt[, idx := as.numeric(factor(question, levels=unique(question)))]
#    question value idx
# 1:        C    10   1
# 2:        C    20   1
# 3:        A    30   2
# 4:        B    40   3
# 5:        B    20   3
# 6:        D    30   4
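
A related variant, sketched here in base R as an alternative rather than as part of the answer above: match() against unique() gives the position of each value among the distinct values, which unique() returns in order of first appearance. This skips building a factor altogether. The vector name question simply mirrors the column in the sample data.

```r
# Same values as the question column in the sample data
question <- c("C", "C", "A", "B", "B", "D")

# unique() keeps first-appearance order; match() returns each value's
# position within that vector, as an integer
idx <- match(question, unique(question))

print(idx)
```

Inside the data.table, the same expression could be used as dt[, idx := match(question, unique(question))].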

Upvotes: 8
