user5680053

Reputation: 33

Create a new column on a data frame containing max date for each row

I am currently working on life tables, and I have a data set with 19 columns. Columns 5 to 19 contain the dates of each birth an individual had. I want to create a new variable (column 20) that contains the latest (last) birth date for each row across columns 5 to 19. The date entries are stored as factors.

Here is what my data looks like:

ID_I        Sex     BirthDate   DeathDate   Parturition1    Parturition2    
501093007   Female  1813-01-14  1859-09-29  1847-11-16      1850-05-17
400707003   Female  1813-01-15  1888-04-14  1844-10-07      1845-10-17
100344004   Female  1813-02-06  1897-05-07  1835-03-09      1837-01-03

I have tried the code suggested in one of the answers:

df[, "max"] <- apply(df[, 5:19], 1, max)

But df$max then contains the overall maximum across all rows instead of a separate maximum for each row. Could it be because my date entries aren't numeric or character?

Upvotes: 2

Views: 3335

Answers (2)

akrun

Reputation: 887991

Based on the example data, we can also use pmax after converting the columns to 'Date' class:

df1$max.date <- do.call(pmax, lapply(df1[3:ncol(df1)], as.Date))
df1$max.date
#[1] "1859-09-29" "1888-04-14" "1897-05-07"

NOTE: For the original dataset, change the 3 to 5 in 3:ncol(df1) so that only the parturition columns (5 to 19) are compared.
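
As a minimal sketch of the pmax approach restricted to the birth columns, the example below rebuilds the sample from the question with the dates stored as factors. The Parturition3 column and its values are hypothetical, added only to show a row with missing births; selecting columns by the "Parturition" name prefix and passing na.rm = TRUE are likewise assumptions about the real data, not part of the answer above.

```r
# Sample rows from the question, stored as factors like the original data.
# Parturition3 is a hypothetical column added to illustrate missing births.
df1 <- data.frame(
  ID_I         = c(501093007, 400707003, 100344004),
  Sex          = c("Female", "Female", "Female"),
  BirthDate    = c("1813-01-14", "1813-01-15", "1813-02-06"),
  DeathDate    = c("1859-09-29", "1888-04-14", "1897-05-07"),
  Parturition1 = c("1847-11-16", "1844-10-07", "1835-03-09"),
  Parturition2 = c("1850-05-17", "1845-10-17", "1837-01-03"),
  Parturition3 = c(NA, "1848-02-01", NA),
  stringsAsFactors = TRUE
)

# Pick the birth columns by name (assumes they all start with "Parturition").
birth_cols <- grep("^Parturition", names(df1))

# Convert each factor column to Date, then take the element-wise (per-row) maximum.
# na.rm = TRUE ignores missing births instead of returning NA for that row.
birth_dates <- lapply(df1[birth_cols], as.Date)
df1$max.date <- do.call(pmax, c(birth_dates, na.rm = TRUE))

df1$max.date
```

Because pmax works element-wise across its arguments, each row gets its own maximum, and the result keeps the 'Date' class.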

Upvotes: 0

mtoto

Reputation: 24198

You're almost there; this should work:

df$max.date <- apply(df[,5:19],1,max)
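
To illustrate what apply does with factor columns, here is a small sketch using two of the birth columns from the question (the data frame below is a reduced, hypothetical stand-in for the real 19-column data). apply first coerces the data frame to a matrix, which turns factor columns into character strings; dates written as YYYY-MM-DD sort correctly as strings, so a row-wise max gives the latest date. Passing na.rm = TRUE and converting the result with as.Date are additions assumed here for rows with fewer births and for later date arithmetic.

```r
# Reduced stand-in for the question's data: two birth columns stored as factors.
df <- data.frame(
  Parturition1 = factor(c("1847-11-16", "1844-10-07", "1835-03-09")),
  Parturition2 = factor(c("1850-05-17", "1845-10-17", "1837-01-03"))
)

# apply() works on a matrix; factor columns become character strings here.
m <- as.matrix(df[, c("Parturition1", "Parturition2")])
is_char <- is.character(m)

# Row-wise maximum of the date strings (MARGIN = 1 means "per row").
# na.rm = TRUE skips missing entries in rows with fewer recorded births.
latest <- apply(m, 1, max, na.rm = TRUE)

# Convert the character result to Date so it can be used in date arithmetic.
df$max.date <- as.Date(latest)

df$max.date
```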

Upvotes: 1
