Reputation: 193
I am trying to create a function that shows how many "person-years" an individual has contributed to a given age-group in a given period. If the person is alive during the specified interval, the person contributes to the time-interval. For example, for the age-group 0-1, an individual who came under observation at age 0.5 and left at age 3 will have contributed 0.5 years to the person-years for the 0-1 age group.
I've been able to run this code successfully over a for-loop, but it takes forever, so I'm trying to implement a vector-based function instead. The function works fine for individual entries, but cannot handle the vectors I pass to it, giving the error: "...the condition has length > 1 and only the first element will be used"
The function I've written is as follows:
pyears01.smm <- function(ageent, ageleave) {
  if ( is.na(ageent) | is.na(ageleave) )
  {NA} else
  if( ageent > 1 )
  {0}
  if ( ageent <= 1 && ageleave > 1 )
  {1-ageent} else
  if( ageent <= 1 && ageleave <= 1 )
  {ageleave-ageent}
}
which works fine for evaluating the following:
pyears.smm(0,5)
[1] 1
pyears.smm(0.5,0.75)
[1] 0.25
pyears.smm(2,3)
[1] 0
but does not evaluate NAs correctly:
> pyears.smm(NA,NA)
[1] 0
> pyears.smm("NA",5)
[1] 0
and doesn't handle vectors correctly:
x <- c(0,0.5,2,5)
y <- c(5,0.75,3,NA)
z<- pyears.smm(x,y)
Warning message:
In if (!is.na(ageent) & ageent <= 1 & !is.na(ageleave) & ageleave > :
the condition has length > 1 and only the first element will be used
> z
[1] 1.0 0.5 -1.0 -4.0
I have read that elseif takes vectors while if statements like this can only evaluate single elements, but I have several layers of nested if statements, so I'm not sure how to fix this. Any suggestions would be appreciated. Thanks!
Upvotes: 0
Views: 1807
Reputation: 263332
The problem you are trying to solve has already been addressed in two packages that I am aware of: "survival" and "Epi". You are (unnecessarily) reinventing the Lexis diagram.
Upvotes: 0
Reputation: 57686
The vectorised form of an if-else construct is ifelse (not elseif). However, you don't really need it for this exercise. Instead, use pmax and pmin to get the (elementwise) upper and lower bounds for the exposure interval for each observation, and also to handle the case where the ages at entry and exit are outside the interval entirely.
pyears01.smm <- function(ageent, ageleave)
pmax(0, pmin(ageleave, 1) - pmax(ageent, 0))
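To see how the clipping works on the vectors from the question, here is a minimal sketch that also generalises the band limits. The names pyears_band, lower and upper are made up for illustration and are not part of the answer above; the only assumption is that each age band is a closed interval from lower to upper.

```r
# Hypothetical generalisation: person-years contributed to the age band [lower, upper].
# pyears_band, lower and upper are illustrative names, not from the original answer.
pyears_band <- function(ageent, ageleave, lower = 0, upper = 1) {
  # Clip each observation's interval to the band, then floor negative lengths at 0.
  # pmax/pmin keep NA by default, so a missing age gives NA person-years.
  pmax(0, pmin(ageleave, upper) - pmax(ageent, lower))
}

x <- c(0, 0.5, 2, 5)
y <- c(5, 0.75, 3, NA)

res01 <- pyears_band(x, y)                        # band 0-1: 1, 0.25, 0, NA
res15 <- pyears_band(x, y, lower = 1, upper = 5)  # band 1-5: 4, 0, 1, NA
```

Because everything is elementwise, the same call works on whole data frame columns without a loop.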
Upvotes: 2
Reputation: 69171
The warning message you are getting is a common one, especially if you are coming from another programming language. You are looking for the ifelse() function, which operates on vectors. As the warning message told you, if() only used the first element of the condition. Here's the ifelse() version of your code:
pyears01.smm2 <- function(ageent, ageleave){
ifelse(is.na(ageent) | is.na(ageleave), NA
, ifelse(ageent > 1,0
, ifelse(ageent <= 1 & ageleave > 1, 1 - ageent, ageleave - ageent)))
}
> pyears01.smm2(NA, NA)
[1] NA
> pyears01.smm2(NA, 5)
[1] NA
> x <- c(0,0.5,2,5)
> y <- c(5,0.75,3,NA)
> pyears01.smm2(x,y)
[1] 1.00 0.25 0.00 NA
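As a rough illustration of the difference (a sketch, with variable names chosen only for this example): if() expects a single TRUE or FALSE, while ifelse() and the | and & operators work element by element. Note also that the quoted string "NA" from the question is an ordinary character value, not a missing value.

```r
x <- c(0, 0.5, 2, 5)

# if() needs a single TRUE/FALSE; a length-4 condition like this one
# triggers the warning from the question (and is an error in R >= 4.2.0).
cond <- x > 1
length(cond)

# ifelse() tests every element and picks from yes/no position by position.
flag <- ifelse(cond, "over 1", "1 or under")

# | and & are elementwise; || and && are meant for single values inside if().
na_any <- is.na(c(NA, 1)) | is.na(c(2, NA))

# The string "NA" is not a missing value, so is.na() does not catch it.
is_missing <- c(is.na(NA), is.na("NA"))
```

Here flag has one label per element of x, na_any is TRUE wherever either input is missing, and is_missing shows that only the unquoted NA counts as missing.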
If you Google or search on SO for differences between if else and ifelse(), I'm sure you'll find some good stuff. Here's one link that rose to the top: http://rwiki.sciviews.org/doku.php?id=tips:programming:ifelse
Upvotes: 3