Reputation: 135
I am struggling with some data manipulation. One of the columns in my dataset contains the date of birth, but for one location the values are off by 100 years.
I made a small example data frame to explain my problem: the dates for Paris and Berlin are correct, and I want to change the date only for the rows with London as location (in this example, from 2028-03-25 to 1928-03-25).
library(lubridate)
date <- as.Date(c('1950-11-1','2028-3-25','1940-3-14'))
location <- c("Paris", "London", "Berlin")
df <- data.frame(date, location)
df$date_new <- ifelse(df$location %in% c("London"), df$date - years(100), df$date)
As you can see, I loaded the lubridate package and tried to use ifelse(), but that just gives me negative numbers in the new column instead of dates.
The solution is probably very simple, but I cannot figure it out and it's driving me insane.
Thank you!
Upvotes: 4
Views: 643
Reputation: 17412
ifelse takes the class attribute of its result from test, not from yes or no. From its documentation:
The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.
Sometimes it is better to use a construction such as (tmp <- yes; tmp[!test] <- no[!test]; tmp), possibly extended to handle missing values in test.
So it looks like it's best not to use ifelse. Here's one solution:
> df$date_new = df$date
> df[df$location == "London",]$date_new = df[df$location == "London",]$date_new - years(100)
> df
        date location   date_new
1 1950-11-01    Paris 1950-11-01
2 2028-03-25   London 1928-03-25
3 1940-03-14   Berlin 1940-03-14
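The construction quoted from the documentation can also be written out directly. This is only a minimal sketch of that idea applied to the example data; the names test, yes, no and tmp simply mirror the documentation's wording and are not required by anything.

```r
library(lubridate)

df <- data.frame(
  date = as.Date(c("1950-11-01", "2028-03-25", "1940-03-14")),
  location = c("Paris", "London", "Berlin")
)

test <- df$location == "London"
yes  <- df$date - years(100)   # value wanted where test is TRUE
no   <- df$date                # value wanted where test is FALSE

# Start from `yes`, which is already a Date vector, and overwrite only
# the positions where test is FALSE. The Date class is never dropped.
tmp <- yes
tmp[!test] <- no[!test]
df$date_new <- tmp

class(df$date_new)
```

Because the assignment happens inside an existing Date vector, no conversion back from numbers is needed afterwards.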
However, if you want to use ifelse, you can coerce its numeric result back into a Date by passing the standard origin (origin is an object exported by lubridate, set to 1970-01-01 UTC):
> library(lubridate)
> date <- as.Date(c('1950-11-1','2028-3-25','1940-3-14'))
> location <- c("Paris", "London", "Berlin")
> df <- data.frame(date, location)
> df$date_new <- as.Date(ifelse(df$location == "London", as.Date(df$date - years(100)), df$date), origin = origin)
> df
        date location   date_new
1 1950-11-01    Paris 1950-11-01
2 2028-03-25   London 1928-03-25
3 1940-03-14   Berlin 1940-03-14
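To see why the conversion is needed, it may help to look at what ifelse returns on its own. The sketch below is an illustration only: the variable names raw and fixed are made up for this example, and it passes the origin as the string "1970-01-01", which should be equivalent to the lubridate origin object for this purpose.

```r
library(lubridate)

date <- as.Date(c("1950-11-01", "2028-03-25", "1940-03-14"))
location <- c("Paris", "London", "Berlin")

# ifelse drops the Date class: the result is a plain number of days
# since 1970-01-01, which is negative for any earlier date.
raw <- ifelse(location == "London", date - years(100), date)
class(raw)   # "numeric"

# Re-attach the Date class by saying what day zero is.
fixed <- as.Date(raw, origin = "1970-01-01")
fixed
```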
Upvotes: 3
Reputation: 5951
Try this as an alternative:
df$date_new <- df$date
df$date_new[df$location=="London"] <- df$date_new[df$location=="London"] - years(100)
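Or, to keep ifelse, convert both branches to character first. Note that this gives a character column, so wrap the result in as.Date() if you need a Date column. Instead of df$date_new <- ifelse(df$location %in% c("London"), df$date - years(100), df$date)
try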
df$date_new <- ifelse(df$location %in% c("London"), as.character(df$date - years(100)), as.character(df$date))
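If the column should end up as dates rather than text, the character result can be parsed back. A short sketch under the same example data; the intermediate name chr is just for illustration.

```r
library(lubridate)

df <- data.frame(
  date = as.Date(c("1950-11-01", "2028-03-25", "1940-03-14")),
  location = c("Paris", "London", "Berlin")
)

# Both branches are character, so ifelse has nothing to strip.
chr <- ifelse(df$location %in% c("London"),
              as.character(df$date - years(100)),
              as.character(df$date))
class(chr)   # "character"

# Parse the ISO-formatted strings back into Date values.
df$date_new <- as.Date(chr)
```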
Upvotes: 4