I am manipulating data containing dates and am having a bit of trouble. Essentially, I wish to calculate a new date based on two existing dates and another variable, for all rows in my dataframe. For example, I would like to be able to subtract 10 days from Date1, or calculate the date that is midway between Date1 and Date2, etc. However, I am having trouble understanding class assignment when adding the newly calculated date to the dataframe. Sample dataframe:
# Uncomment to clear your session...
# rm(list = ls(all = TRUE))
tC <- textConnection("StudyID Date1 Date2
C0031 2-May-09 12-Jan-10
C0032 7-May-09 30-Apr-10")
data <- read.table(header=TRUE, tC)
close.connection(tC)
rm(tC)
#CONVERTING TO DATES
data$Date1 <- with(data,as.Date(Date1,format="%d-%b-%y"))
data$Date2 <- with(data,as.Date(Date2,format="%d-%b-%y"))
Now here is where my problem begins:
class(data[1, "Date2"] - 10) # class is "Date". So far so good.
data[1, "newdate"] <- (data[1, "Date2"] - 10)
class(data[1, "newdate"]) # class is now "numeric"...
I also tried:
data[1, "newdate"] <- as.Date(data[1, "Date2"] - 10)
class(data[1, "newdate"]) # doesn't help. Class still "numeric"...
I just don't understand why this value becomes numeric when it is assigned to data.
The problem is due to recycling of your vector, which strips its attributes. As I stated in my comment, use e.g. data$newdate <- data$Date1 - 10 to create the whole column without recycling the vector, thus retaining attributes such as the Date class. Consider the illustrative toy example below:
# Simple vector with an attribute
x <- 1:3
attributes(x) <- list( att = "some attributes" )
x
#[1] 1 2 3
#attr(,"att")
#[1] "some attributes"
# Simple data.frame with 3 rows
df <- data.frame( a = 1:3 )
# New column using first element of vector with attributes
df$b <- x[1]
# It is recycled to correct number of rows and attributes are stripped
str(df$b)
# int [1:3] 1 1 1
# Without recycling attributes are retained
df$c <- x
str(df$c)
# atomic [1:3] 1 2 3
# - attr(*, "att")= chr "some attributes"
# But they all look the same...
df
# a b c
#1 1 1 1
#2 2 1 2
#3 3 1 3
And from your data, the class is stored as an attribute:
attributes(data$Date1)
# $class
# [1] "Date"
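Applied to the data in the question, a minimal sketch of the whole-column approach might look like the following. The column names newdate and middate are only illustrative, and the dates are entered in ISO format here (an assumption made to avoid locale-dependent parsing of "%b"), so adapt as needed:

```r
# Rebuild the sample data with Date columns
data <- data.frame(
  StudyID = c("C0031", "C0032"),
  Date1   = as.Date(c("2009-05-02", "2009-05-07")),
  Date2   = as.Date(c("2010-01-12", "2010-04-30"))
)

# Assign whole columns at once so the Date class is kept
data$newdate <- data$Date2 - 10                              # 10 days before Date2
data$middate <- data$Date1 + (data$Date2 - data$Date1) / 2   # midway between Date1 and Date2

# Both new columns should report class "Date"
sapply(data[c("newdate", "middate")], class)
```

Note that the midpoint can fall on a half day when the two dates are an odd number of days apart; wrapping the offset in floor() or as.integer() is one way to get a whole day if that matters.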
The problem is caused by the column newdate not existing yet, in combination with assigning only a single value:
# create a single value in a new column
data[1, "newdate"] <- data[1, "Date2"] - 10
class(data[1, "newdate"]) # numeric
# create the whole column
data[ , "newdate2"] <- data[1, "Date2"] - 10
class(data[1, "newdate2"]) # Date
# create a column of class Date before assigning value
data[ , "newdate3"] <- as.Date(NA)
data[1, "newdate3"] <- data[1, "Date2"] - 10
class(data[1, "newdate3"]) # Date
By the way, you don't need as.Date when performing mathematical operations with Date objects.
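A quick sketch to illustrate that last point, using an arbitrary example date (d is just a name chosen here):

```r
d <- as.Date("2010-01-12")

class(d - 10)                      # "Date": subtracting days keeps the class
class(d - as.Date("2009-05-02"))   # "difftime": a date minus a date is a time difference
as.Date(d - 10) == d - 10          # TRUE: wrapping the result in as.Date() changes nothing
```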