itjcms18

Reputation: 4333

Replace a column in a data frame given a corresponding vector in R

I'm looking for a quicker way to replace the days of the week in an R data frame with numbers. More generally: given one vector of values and a corresponding vector of replacements, is there a quick way to apply the replacement to a data frame column?

Here is my dataframe:

   month day_of_week  skies
 1   APR     Tuesday Clear 
 2   APR   Wednesday Cloudy
 3   APR    Thursday Cloudy
 4   APR      Friday Cloudy
 5   APR    Saturday Cloudy
 6   APR      Sunday Clear 

The days of the week are in the following vector:

 daysweek <- unique(df$day_of_week)
 daysweek
 [1] Tuesday   Wednesday Thursday  Friday    Saturday  Sunday    Monday

The corresponding vector is:

 days_num <- c(2,3,4,5,6,7,1)

The long way would be to skip the corresponding vector and call gsub once for each day. Is there a quicker way? I tried the for loop below but couldn't get it to work.

for (i in c(1:7)) {
  df$result <- gsub(daysweek[i], days_num[i], df$day_of_week)
}

The desired output is:

   month day_of_week  skies
 1   APR     2       Clear 
 2   APR     3       Cloudy
 3   APR     4       Cloudy
 4   APR     5       Cloudy
 5   APR     6       Cloudy
 6   APR     7       Clear 

Upvotes: 0

Views: 114

Answers (1)

akrun

Reputation: 887118

Create an index of weekday names and match the day_of_week column against it.

Date <- as.Date('2014-12-29') #Monday 
Wdays <- weekdays(seq(Date, length.out=7, by= '1 day'))

df[,2] <- match(df[,2],Wdays)
df[,2] 
#[1] 2 3 4 5 6 7

Or you can convert the column to a factor with levels ordered from Monday to Sunday and then convert it to numeric:

as.numeric(factor(df$day_of_week, levels=c("Monday", "Tuesday",
    "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")))
#[1] 2 3 4 5 6 7
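
One thing to keep in mind with the factor approach is that any value not listed in the levels becomes NA rather than raising an error. A minimal sketch, where the vector `x` and the value "Funday" are made up for illustration:

```r
# Levels in the order that defines the numbering (Monday = 1, ..., Sunday = 7)
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")

# "Funday" is a made-up value that is not among the levels
x <- c("Tuesday", "Sunday", "Funday")

# Values outside the levels are silently converted to NA
res <- as.integer(factor(x, levels = days))
res
#[1]  2  7 NA
```

Checking the result with anyNA() is a cheap way to catch misspelled or unexpected day names.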

Update

If you have a vector of numeric indices that correspond to the unique values in the day_of_week column, you can name the numbers by day and use that named vector as a lookup table:

Un <- c('Tuesday',   'Wednesday', 'Thursday',  'Friday',   
        'Saturday',  'Sunday',    'Monday')
days_num <- c(2,3,4,5,6,7,1)
set.seed(24)
day_of_week <- sample(Un, 20, replace=TRUE)
unname(setNames(days_num, Un)[day_of_week])
#[1] 4 3 6 5 6 1 3 7 7 3 6 4 6 6 4 1 3 2 5 2
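
The same lookup can also be written with match(), using the two vectors from the question directly. This is a sketch on a small stand-in for the question's data frame (it assumes the column is stored as character, which may differ from the real data):

```r
# Stand-in for the data frame shown in the question
df <- data.frame(
  month = "APR",
  day_of_week = c("Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"),
  skies = c("Clear", "Cloudy", "Cloudy", "Cloudy", "Cloudy", "Clear"),
  stringsAsFactors = FALSE
)

daysweek <- c("Tuesday", "Wednesday", "Thursday", "Friday",
              "Saturday", "Sunday", "Monday")
days_num <- c(2, 3, 4, 5, 6, 7, 1)

# match() returns the position of each day in daysweek;
# indexing days_num by that position gives the paired number
df$day_of_week <- days_num[match(df$day_of_week, daysweek)]
df$day_of_week
#[1] 2 3 4 5 6 7
```

Because match() works on whole values rather than patterns, it avoids the partial-match surprises that regular expressions can introduce.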

Because you used gsub, another option would be mgsub from qdap:

 library(qdap)
 as.numeric(mgsub(Un, days_num, day_of_week))
 #[1] 4 3 6 5 6 1 3 7 7 3 6 4 6 6 4 1 3 2 5 2

or

library(qdapTools)
day_of_week %l% data.frame(Un, days_num)
 #[1] 4 3 6 5 6 1 3 7 7 3 6 4 6 6 4 1 3 2 5 2
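
For completeness, the loop in the question fails because every iteration starts again from the original df$day_of_week and overwrites df$result, so only the last replacement survives. A sketch of a repaired loop that accumulates the replacements in one vector (the sample `day_of_week` values here are made up for illustration):

```r
daysweek <- c("Tuesday", "Wednesday", "Thursday", "Friday",
              "Saturday", "Sunday", "Monday")
days_num <- c(2, 3, 4, 5, 6, 7, 1)

# Made-up sample values standing in for df$day_of_week
day_of_week <- c("Tuesday", "Monday", "Sunday", "Tuesday")

# Start from the original values and keep updating the same vector
result <- day_of_week
for (i in seq_along(daysweek)) {
  # fixed = TRUE treats the day name as a literal string, not a regex
  result <- gsub(daysweek[i], days_num[i], result, fixed = TRUE)
}

# gsub() returns character, so convert back to numbers at the end
result <- as.numeric(result)
result
#[1] 2 1 7 2
```

This works, but the vectorised lookups are shorter and do not make one pass over the column per day.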

Upvotes: 2