Eric Fail

Reputation: 7928

calculate age in years and months and melt data

I'm working with some time data and I'm having problems converting a time difference to years and months.

My data looks more or less like this,

dfn <- data.frame(
Today  = Sys.time(),
DOB  = seq(as.POSIXct('2007-03-27 00:00:01'), len= 26, by="3 day"),
Patient  = factor(1:26, labels = LETTERS))

First I subtract the date of birth (DOB) from today's date (Today).

dfn$ageToday <-  dfn$Today - dfn$DOB

This gives me the time difference in days.

dfn$ageToday
 Time differences in days
  [1] 1875.866 1872.866 1869.866 1866.866 1863.866
  [6] 1860.866 1857.866 1854.866 1851.866 1848.866
 [11] 1845.866 1842.866 1839.866 1836.866 1833.866
 [16] 1830.866 1827.866 1824.866 1821.866 1818.866
 [21] 1815.866 1812.866 1809.866 1806.866 1803.866
 [26] 1800.866
 attr(,"tzone")
 [1] ""

This is where the first part of my question comes in: how do I convert this difference to years and months (rounded to months)? (i.e. 4.7, 4.11, etc.)

I read the ?difftime and ?format man pages, but I could not figure it out.

Any help would be appreciated.

Furthermore, I would like to melt my final object. If I try melt on the data frame above using this command,

require(plyr)
require(reshape)
mdfn <- melt(dfn, id=c('Patient'))

I get this strange error I haven't seen before:

Error in as.POSIXct.default(value) : 
  do not know how to convert 'value' to class "POSIXct"

So, my second question is: how do I create a time difference I can melt alongside my POSIXct variables? If I melt without dfn$ageToday everything works like a charm.

Thanks, Eric

Upvotes: 4

Views: 2301

Answers (1)

daedalus

Reputation: 10923

The lubridate package makes working with dates and times, including finding time differences, really easy.

library("lubridate")
library("reshape2")

dfn <- data.frame(
    Today  = Sys.time(),
    DOB  = seq(as.POSIXct('2007-03-27 00:00:01'), len= 26, by="3 day"),
    Patient  = factor(1:26, labels = LETTERS))

# interval() was called new_interval() in older lubridate releases
dfn$diff <- interval(dfn$DOB, dfn$Today) / duration(num = 1, units = "years")

mdfn <- melt(dfn, id=c('Patient'))
class(mdfn$value) # all values are coerced into numeric

The interval() function (named new_interval() in older lubridate releases) creates the time span between two date-times. Note that lubridate also has a function today() that could substitute for your use of Sys.time(). Finally, note the duration() function, which creates a standard, ehm, duration: dividing the interval by it expresses the span in standard units, in this case units of one year.
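
For comparison, here is a minimal base R sketch of the same idea without lubridate: ask difftime() for a fixed unit and strip the class with as.numeric(), so the column is a plain number rather than a difftime object. The fixed dates and the 365.25 days-per-year factor are assumptions made for illustration and are not taken from the question.

```r
# Fixed dates instead of Sys.time() so the result is reproducible
# (both values are illustrative assumptions)
today <- as.POSIXct("2012-05-14 00:00:00", tz = "UTC")
dob   <- as.POSIXct("2007-03-27 00:00:00", tz = "UTC")

# Request the difference in days explicitly, then drop the difftime class
age_days <- as.numeric(difftime(today, dob, units = "days"))

# Approximate conversion to years; 365.25 days per year is an assumption
age_years <- age_days / 365.25
```

Because age_years is an ordinary numeric vector, it behaves like the diff column above when melted. The conversion is only approximate, since calendar years and months do not have a fixed length.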

If you want to preserve the contents of Today and DOB, you may want to convert them to character before melting and convert them back later:

library("lubridate")
library("reshape2")

dfn <- data.frame(
  Today  = Sys.time(),
  DOB  = seq(as.POSIXct('2007-03-27 00:00:01'), len= 26, by="3 day"),
  Patient  = factor(1:26, labels = LETTERS))

# Create standard durations for a year and a month
one.year <- duration(num = 1, units = "years")
one.month <- duration(num = 1, units = "months")

# interval() replaces new_interval() from older lubridate releases
# Calculate the difference in years as a decimal number
dfn$diff.years <- interval(dfn$DOB, dfn$Today) / one.year

# Calculate the total age in months, rounded to the nearest month
dfn$diff.months <- round( interval(dfn$DOB, dfn$Today) / one.month )

# Derive whole years and leftover months from the same rounded total, so
# that rounding up to a full year carries over (e.g. 59.6 months -> 5 | 0)
dfn$years <- dfn$diff.months %/% 12
dfn$months <- dfn$diff.months %% 12

# Paste the years and months together
# I am not using the decimal point so as not to imply this is
# a numeric representation of the difference
dfn$y.m <- paste(dfn$years, dfn$months, sep = '|')

# convert Today and DOB to character so as to preserve them in melting
dfn$Today <- as.character(dfn$Today)
dfn$DOB <- as.character(dfn$DOB)

# melt using string representation of difference between the two dates
dfn2 <- dfn[,c("Today", "DOB", "Patient", "y.m")]
mdfn2 <- melt(dfn2, id=c('Patient'))

# alternative melt using numeric representation of difference in years
dfn3 <- dfn[,c("Today", "DOB", "Patient", "diff.years")]
mdfn3 <- melt(dfn3, id=c('Patient'))
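
The "reconvert later" step is not shown above. A minimal base R sketch follows, assuming a long-format data frame with variable and value columns like the one melt() returns. The small hand-built frame named long is a hypothetical stand-in for mdfn2, and tz = "UTC" is an assumption (use whatever time zone the original data was in).

```r
# Hand-built stand-in for the long-format output of melt();
# in practice this would be mdfn2 from the code above
long <- data.frame(
  Patient  = c("A", "A", "A"),
  variable = c("Today", "DOB", "y.m"),
  value    = c("2012-05-14 20:47:00", "2007-03-27 00:00:01", "5|2"),
  stringsAsFactors = FALSE)

# Keep only the DOB rows; the value column is still character here
dob_rows <- long[long$variable == "DOB", ]

# Parse the strings back into POSIXct (time zone is an assumption)
dob_rows$value <- as.POSIXct(dob_rows$value,
                             format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

The conversion has to be done per variable after subsetting, because a single value column cannot hold date-times and strings such as "5|2" at the same time.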

Upvotes: 5
