Reputation: 4333
I'm looking for a quicker way to replace the days of the week in an R data frame with numbers. More generally: given one vector of values and a corresponding vector of replacements, is there a quick way to apply the replacement to a data frame column?
Here is my dataframe:
  month day_of_week  skies
1   APR     Tuesday  Clear
2   APR   Wednesday Cloudy
3   APR    Thursday Cloudy
4   APR      Friday Cloudy
5   APR    Saturday Cloudy
6   APR      Sunday  Clear
The days of the week are in the following vector:
daysweek <- unique(df$day_of_week)
daysweek
[1] Tuesday Wednesday Thursday Friday Saturday Sunday Monday
The corresponding vector is:
days_num <- c(2,3,4,5,6,7,1)
The long way would be to skip the corresponding vector and call gsub once per day. I was wondering whether there is a quicker way. I tried a for loop, but couldn't get it to work:
for (i in c(1:7)) {
df$result <- gsub(daysweek[i], days_num[i], df$day_of_week)
}
The desired output would be:
  month day_of_week  skies
1   APR           2  Clear
2   APR           3 Cloudy
3   APR           4 Cloudy
4   APR           5 Cloudy
5   APR           6 Cloudy
6   APR           7  Clear
Upvotes: 0
Views: 114
Reputation: 887118
Create an index of weekdays and match it against the day_of_week column.
Date <- as.Date('2014-12-29') #Monday
Wdays <- weekdays(seq(Date, length.out=7, by= '1 day'))
df[,2] <- match(df[,2],Wdays)
df[,2]
#[1] 2 3 4 5 6 7
Or you can convert the column to a factor with levels from Monday to Sunday and convert it to numeric:
as.numeric(factor(df$day_of_week, levels=c("Monday", "Tuesday",
"Wednesday", "Thursday", "Friday", "Saturday", "Sunday")))
#[1] 2 3 4 5 6 7
If you have a vector of numeric indices that correspond to the unique values in the day_of_week column, you can use a named vector as a lookup:
Un <- c('Tuesday', 'Wednesday', 'Thursday', 'Friday',
'Saturday', 'Sunday', 'Monday')
days_num <- c(2,3,4,5,6,7,1)
set.seed(24)
day_of_week <- sample(Un, 20, replace=TRUE)
unname(setNames(days_num, Un)[day_of_week])
#[1] 4 3 6 5 6 1 3 7 7 3 6 4 6 6 4 1 3 2 5 2
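The same lookup can probably also be written with match() and the two vectors from the question, applied directly to the data frame column. This is only a sketch: df below is rebuilt by hand from the six rows printed in the question, and daysweek is typed out as a character vector rather than taken from unique(), both of which are assumptions.

```r
# Data frame rebuilt from the rows shown in the question (assumed contents)
df <- data.frame(
  month = rep("APR", 6),
  day_of_week = c("Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"),
  skies = c("Clear", "Cloudy", "Cloudy", "Cloudy", "Cloudy", "Clear"),
  stringsAsFactors = FALSE
)

# The two corresponding vectors from the question
daysweek <- c("Tuesday", "Wednesday", "Thursday", "Friday",
              "Saturday", "Sunday", "Monday")
days_num <- c(2, 3, 4, 5, 6, 7, 1)

# match() gives the position of each day in daysweek;
# indexing days_num with those positions returns the replacement.
# Days not found in daysweek become NA.
df$day_of_week <- days_num[match(df$day_of_week, daysweek)]
df$day_of_week
#[1] 2 3 4 5 6 7
```

Because the replacement is a plain positional lookup, it works for any pair of equal-length vectors, not only weekdays.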
Because you used gsub, another option would be mgsub from qdap:
library(qdap)
as.numeric(mgsub(Un, days_num, day_of_week))
#[1] 4 3 6 5 6 1 3 7 7 3 6 4 6 6 4 1 3 2 5 2
Or, with the %l% lookup operator from qdapTools:
library(qdapTools)
day_of_week %l% data.frame(Un, days_num)
#[1] 4 3 6 5 6 1 3 7 7 3 6 4 6 6 4 1 3 2 5 2
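For completeness, the loop from the question can likely be made to work in base R by feeding each gsub result into the next iteration instead of restarting from the original column every time. A minimal sketch, where the short day_of_week vector is made-up example input and result is a hypothetical variable name:

```r
# Made-up example input (not the data from the question)
day_of_week <- c("Tuesday", "Sunday", "Monday", "Friday")

daysweek <- c("Tuesday", "Wednesday", "Thursday", "Friday",
              "Saturday", "Sunday", "Monday")
days_num <- c(2, 3, 4, 5, 6, 7, 1)

# Start from the original values and update the SAME vector on each pass,
# so earlier replacements are not overwritten by later iterations.
result <- day_of_week
for (i in seq_along(daysweek)) {
  result <- gsub(daysweek[i], days_num[i], result, fixed = TRUE)
}

# gsub returns character strings, so convert at the end
result <- as.numeric(result)
result
#[1] 2 7 1 5
```

The vectorised lookups above avoid the loop entirely and are usually the simpler choice.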
Upvotes: 2