Reputation: 1063
I have a data frame in which some columns take the values -2, -1, 0, 1, 2, 3, 4. I want to convert those columns to factors whose levels are labeled 0 or 1, following this convention:
-2 = 1
-1 = 1
0 = 0
1 = 1
2 = 1
3 = 1
4 = 0
I have the following code:
#Convert to factor
dat[idx] <- lapply(dat[idx], factor, levels = -2:4, labels = c(1, 1, 0, 1, 1, 1, 0))
#Drop unused factor levels
dat <- droplevels(dat)
This works, but it gives me the following warning:
In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
duplicated levels in factors are deprecated
I tried the following code (per Ananda Mahto's suggestion) but no luck:
levels(dat[idx]) <- list(`0` = c(0, 4), `1` = c(-2, -1, 1, 2, 3))
I figure there has to be a better way to do this. Any suggestions?
My data looks like this:
structure(list(Timestamp = structure(c(1380945601, 1380945603,
1380945605, 1380945607, 1380945609, 1380945611, 1380945613, 1380945615,
1380945617, 1380945619), class = c("POSIXct", "POSIXt"), tzone = ""),
FCB2C01 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), RCB2C01 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), FCB2C02 = c(1, 1, 1, 1, 1, 1,
1, 1, 1, 1), RCB2C02 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), FCB2C03 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), RCB2C03 = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), FCB2C04 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1), RCB2C04 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), FCB2C05 = c(1, 1, 1, 1, 1, 1,
1, 1, 1, 1), RCB2C05 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), FCB2C06 = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1), RCB2C06 = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), FCB2C07 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1), RCB2C07 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), FCB2C08 = c(1, 1, 1, 1, 1, 1,
1, 1, 1, 1), RCB2C08 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), FCB2C09 = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1), RCB2C09 = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), FCB2C10 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1), RCB2C10 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("Timestamp", "FCB2C01",
"RCB2C01", "FCB2C02", "RCB2C02", "FCB2C03", "RCB2C03", "FCB2C04",
"RCB2C04", "FCB2C05", "RCB2C05", "FCB2C06", "RCB2C06", "FCB2C07",
"RCB2C07", "FCB2C08", "RCB2C08", "FCB2C09", "RCB2C09", "FCB2C10",
"RCB2C10"), row.names = c(NA, 10L), class = "data.frame")
And the column index:
idx <- seq(2,21,2)
Upvotes: 3
Views: 3021
Reputation: 193587
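If I correctly understand what you want to do, the "right" way would be to create the factor first and then use the levels replacement function, levels(x) <- list(...), with a named list that maps each new level to the old values it should absorb. Compare the following: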
set.seed(1)
x <- sample(-2:4, 10, replace = TRUE)
YourApproach <- factor(x, levels = -2:4, labels = c(1, 1, 0, 1, 1, 1, 0))
# Warning message:
# In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, :
# duplicated levels in factors are deprecated
YourApproach
# [1] 1 0 1 0 1 0 0 1 1 1
# Levels: 1 1 0 1 1 1 0
xFac <- factor(x, levels = -2:4)
levels(xFac) <- list(`0` = c(0, 4), `1` = c(-2, -1, 1, 2, 3))
xFac
# [1] 1 0 1 0 1 0 0 1 1 1
# Levels: 0 1
Note the difference in the "Levels" in each of those. This also means that the underlying numeric representation is going to be different:
as.numeric(YourApproach)
# [1] 2 3 5 7 2 7 7 5 5 1
as.numeric(xFac)
# [1] 2 1 2 1 2 1 1 2 2 2
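The attempt in the question, levels(dat[idx]) <- list(...), assigns levels to a data frame rather than to a factor, which is probably why it had no effect. A minimal sketch of applying the same recoding column by column follows. The helper name recode01 and the small stand-in data frame are assumptions made for illustration and are not part of the original post.

```r
# Hypothetical helper (not from the original post): turn one numeric
# vector with values in -2:4 into a factor with levels "0" and "1".
recode01 <- function(x) {
  f <- factor(x, levels = -2:4)
  # Named list: each name is a new level, each value lists the old levels merged into it
  levels(f) <- list(`0` = c(0, 4), `1` = c(-2, -1, 1, 2, 3))
  f
}

# Small stand-in for the question's data (assumed values, for illustration only)
dat <- data.frame(id = 1:5,
                  a  = c(-2, 0, 3, 4, 1),
                  b  = c(0, 0, 4, 2, -1))
idx <- 2:3

# Apply the recoding to each selected column
dat[idx] <- lapply(dat[idx], recode01)
str(dat)
```

Every recoded column keeps both levels, 0 and 1, even when only one of them occurs in the data. If unused levels should be removed, droplevels(dat) can still be called afterwards, as in the question.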
Upvotes: 4