James Weitz

Reputation: 31

Vectorised solution to for loop

Is there a vectorised solution for the for loop below? It runs over a large dataset containing admission data for a medical facility.

EDITED

library(timeDate)  # provides timeFirstDayInMonth() and timeLastDayInMonth()

dateSeq  <- as.Date(c("2015-01-01", "2015-02-01"))

admissionDate  <- as.Date(c("2015-01-03", "2015-01-06", "2015-01-10", "2015-01-05", "2015-01-07", "2015-02-03", "2015-02-06"))
Dfactor  <- c("elective", "acute", "elective", "acute", "acute", "elective", "acute")
Dfactor  <- factor(Dfactor)
df  <- data.frame(admissionDate, Dfactor)
# For each month in 'dateSeq', subset the admissions in that month and tabulate the factor 'Dfactor'


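Dfactorsums  <- numeric(length(dateSeq))

for (i in seq_along(dateSeq)) {
    # Keep only the admissions that fall within month i
    monthStart <- as.Date(timeFirstDayInMonth(dateSeq[i]))
    monthEnd   <- as.Date(timeLastDayInMonth(dateSeq[i]))
    monthSub   <- df[df$admissionDate >= monthStart & df$admissionDate <= monthEnd, ]
    x  <- table(monthSub$Dfactor)
    # x[1] is the count for the first factor level ("acute")
    Dfactorsums[i]  <- as.numeric(x[1])
}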

print(Dfactorsums)
# Outcome = [1] 3 1
# Question: rather than using a for loop, is there a 'vectorised' solution?

Upvotes: 0

Views: 87

Answers (1)

rosscova

Reputation: 5600

This isn't technically "vectorised", but should do what you're after, and should be pretty quick.

library( data.table )
setDT( df )

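# Add a month column (the date column in the question is named admissionDate)
df[ , month := format( admissionDate, "%m" ) ]
# [1] picks the first factor level ("acute"), as x[1] does in the question's loop
df[ , table( Dfactor )[1], by = month ]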

We add a month column to make grouping by month easier, then extract the count you need for each month. This should output a two-column data.table: the month, and a second column (named V1 by default) holding the per-month counts that correspond to your Dfactorsums vector.
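
For comparison, here is a minimal base R sketch of the same idea without the loop or data.table. It is not part of the code above: the names monthKey, counts and acutePerMonth are made up for illustration. It assumes the count wanted is for the first factor level ("acute"), as x[1] gives in the question, and it uses a year-month key so that the same month in different years is not merged.

```r
# Sample data copied from the question
dateSeq <- as.Date(c("2015-01-01", "2015-02-01"))
admissionDate <- as.Date(c("2015-01-03", "2015-01-06", "2015-01-10", "2015-01-05",
                           "2015-01-07", "2015-02-03", "2015-02-06"))
Dfactor <- factor(c("elective", "acute", "elective", "acute", "acute", "elective", "acute"))
df <- data.frame(admissionDate, Dfactor)

# Year-month key per admission; the levels follow dateSeq, so months with
# no admissions still appear (with a count of 0) and keep dateSeq's order
monthKey <- factor(format(df$admissionDate, "%Y-%m"),
                   levels = format(dateSeq, "%Y-%m"))

# One call cross-tabulates every month against every factor level
counts <- table(month = monthKey, Dfactor = df$Dfactor)

# Column for the first level ("acute"), one value per month in dateSeq
acutePerMonth <- as.numeric(counts[, "acute"])
print(acutePerMonth)
# [1] 3 1
```

Admissions that fall outside the months listed in dateSeq get an NA key and are dropped by table(), which mirrors the loop only visiting the months in dateSeq.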

Upvotes: 1
