Brandon McCormick

Reputation: 73

Formatting days of the week in R

I have a variable in my data frame for day of the week.

> str(g.2015.1990$DAY.OF.WEEK)
 Factor w/ 7 levels "Friday","Monday",..: 1 3 4 2 6 7 5 1 3 4 ...

R recognizes this as a factor, but is there a dedicated day-of-week format I can use instead? I've read questions about generating a day of the week, or finding the day of the week for a date you already have, but nothing about converting an existing variable into a day-of-week format.

This will probably be irrelevant to my research in the end, but I'd feel better moving forward if the variable were properly formatted. I can't see where it would come up, but if ordering ever became an issue, R orders the factor levels alphabetically (Friday, Monday, Saturday, etc.), whereas chronological order (Sunday, Monday, Tuesday, etc.) would obviously be desirable.

Here is what I've tried:

dayx = as.Date(g.2015.1990$DAY.OF.WEEK, format = "%A")
dayx = as.Date(as.character(g.2015.1990$DAY.OF.WEEK), format = "%A")
dayx = strptime(g.2015.1990$DAY.OF.WEEK, format = "%A")
dayx = strftime(as.character(g.2015.1990$DAY.OF.WEEK, format = "%A"))
dayx = strptime(g.2015.1990$DAY.OF.WEEK, format = "%a")
dayx = as.Date(g.2015.1990$DAY.OF.WEEK, format = "%a")
dayx = as.Date(as.character(g.2015.1990$DAY.OF.WEEK), format = "%a")
dayx = strftime(as.character(g.2015.1990$DAY.OF.WEEK, format = "%a"))
dayx = strptime(sprintf('%s %04d', g.2015.1990$DATE, g.2015.1990$START.TIME, g.2015.1990$DAY.OF.WEEK), '%Y-%m-%d %H%M %a')

Each one seems to simply replace each observation with today's date:

> dayx = as.Date(g.2015.1990$DAY.OF.WEEK, format = "%A")
> dayx[1:25]
 [1] "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23"
 [6] "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23"
[11] "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23"
[16] "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23"
[21] "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23" "2016-07-23"

Any help is appreciated!

Upvotes: 0

Views: 561

Answers (1)

Zheyuan Li

Reputation: 73265

I think this is relevant:

## This is the order you desire
Weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")

## This simulates your `g.2015.1990$DAY.OF.WEEK`
set.seed(0); test <- factor(sample(Weekdays, 100, replace = TRUE))

## This simulates what you see from `str(g.2015.1990$DAY.OF.WEEK)`
str(test)
# Factor w/ 7 levels "Friday","Monday",..: 3 2 6 5 3 2 3 3 5 5 ...

## We can inspect levels
levels(test)
#[1] "Friday"    "Monday"    "Saturday"  "Sunday"    "Thursday"  "Tuesday"  
#[7] "Wednesday"

## This is what you should do to recode `test` for your desired order of levels
tmp <- levels(test)[as.integer(test)]  ## same result as `tmp <- as.character(test)`
test <- factor(tmp, levels = Weekdays) ## set levels when using `factor()`

## This is what we see now
str(test)
# Factor w/ 7 levels "Sunday","Monday",..: 7 2 3 5 7 2 7 7 5 5 ...

levels(test)
# [1] "Sunday"    "Monday"    "Tuesday"   "Wednesday" "Thursday"  "Friday"   
# [7] "Saturday" 

So, altogether, try:

Weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
tmp <- levels(g.2015.1990$DAY.OF.WEEK)[as.integer(g.2015.1990$DAY.OF.WEEK)]
## use `Weekdays` defined above
g.2015.1990$DAY.OF.WEEK <- factor(tmp, levels = Weekdays)
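
As a quick check of the recoding, here is a minimal sketch on a small made-up vector. `day` is a hypothetical stand-in for `g.2015.1990$DAY.OF.WEEK`, not your data. It also passes the factor straight to `factor()`, which should give the same result as going through `tmp`, since `factor()` converts its input to character before matching it against `levels`.

```r
## Desired chronological order of levels
Weekdays <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")

## Hypothetical stand-in for `g.2015.1990$DAY.OF.WEEK`; levels default to alphabetical
day <- factor(c("Friday", "Sunday", "Tuesday", "Monday", "Friday"))

## Recode in one step; values not listed in `Weekdays` would become NA
day <- factor(day, levels = Weekdays)

levels(day)              # Sunday first, Saturday last
as.integer(day)          # 6 1 3 2 6, i.e. 1 = Sunday, ..., 7 = Saturday
as.character(sort(day))  # "Sunday" "Monday" "Tuesday" "Friday" "Friday"
table(day)               # counts listed in chronological order
```

As for the `as.Date(..., format = "%A")` attempts: a weekday name alone does not pin down a calendar date, and `?strptime` documents that an unspecified year, month, or day is taken from the current date. That would explain why every observation came back as today's date.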

Upvotes: 1
