Reputation: 459
I'm looking for the most straightforward way to retrieve information from a data frame in R. The data frame contains several dates: Day 0, Day 1, Day 2, Day 3, Day 4, Day 5, Day 6, Day 7, and Day 8. Each event is recorded on a specific date, and we want to find events that occurred between any two consecutive days, including across gaps where an entry is missing (e.g. in the table below, between Day 3 and Day 5 in row 1).
Person day0 day1 day2 day3 day4 day5 day6 day7 events
1 10 12 14 18 NA 22 32 50 20
2 11 15 19 NA NA NA 50 67 35
3 12 18 21 26 33 42 50 NA 45
4 15 24 32 NA 43 NA 54 76 40
The full data set has several thousand people.
I attempted to check between the first two days and write the event to a vector:
for(i in 1:length(days$Person)){
  if(days$event[i] != NA){
    if(days$day0[i] != NA){
      if(days$day1[i] != NA){
        if(days$day0[i] < days$events[i] & days$day1[i] > days$events[i]){
          vector[i]<-events[i]
        }
      }
    }
  }
}
However, I continue to get errors.
Error in if (days$day1[i] != NA) { : missing value where TRUE/FALSE needed
Any help would be much appreciated.
Upvotes: 0
Views: 262
Reputation: 3414
It is better to use `data.frame` subsetting than a `for` loop and nested `if` statements.
I added a row to your `data.frame` that meets your filter criteria; otherwise the output for your example would be empty.
Adding `NA` to any number gives `NA`, so `!is.na(events + day0 + day1)` is a shortened version of the three nested `if` checks.
Use `is.na` for the `NA` check, since e.g. `10 != NA` returns `NA`.
An `if` condition throws the error you mentioned when you give it `NA`.
Please use `dput(head(your_data.frame))` to provide an example of your input data as well as the desired output; it makes it easier for the community to help.
Please see the code below:
days <- structure(list(Person = 1:5, day0 = c(10L, 11L, 12L, 15L, 1L),
day1 = c(12L, 15L, 18L, 24L, 20L), day2 = c(14L, 19L, 21L,
32L, 3L), day3 = c(18L, NA, 26L, NA, 4L), day4 = c(NA, NA,
33L, 43L, 5L), day5 = c(22L, NA, 42L, NA, 6L), day6 = c(32L,
50L, 50L, 54L, 7L), day7 = c(50L, 67L, NA, 76L, 8L), events = c(20L,
35L, 45L, 40L, 10L)), class = "data.frame", row.names = c(NA,
-5L))
vector <- subset(days, !is.na(events + day0 + day1) & day0 < events & day1 > events)[["events"]]
vector
The output is a vector of the events that meet your criteria:
# [1] 10
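The snippet above only checks the interval between day0 and day1. If you also need every other pair of consecutive days, including pairs separated by NA entries, one possible approach is sketched below. It is only a sketch under some assumptions: the data frame is rebuilt from the table in the question, the helper name find_interval is hypothetical, and "between" is taken to mean strictly greater than the earlier day and strictly less than the later day, as in the filter above.

```r
# Example data copied from the table in the question.
days <- data.frame(
  Person = 1:4,
  day0 = c(10, 11, 12, 15),
  day1 = c(12, 15, 18, 24),
  day2 = c(14, 19, 21, 32),
  day3 = c(18, NA, 26, NA),
  day4 = c(NA, NA, 33, 43),
  day5 = c(22, NA, 42, NA),
  day6 = c(32, 50, 50, 54),
  day7 = c(50, 67, NA, 76),
  events = c(20, 35, 45, 40)
)

# All columns named day0, day1, ... in their original order.
day_cols <- grep("^day[0-9]+$", names(days), value = TRUE)

# Hypothetical helper: for one person, return the names of the two
# neighbouring non-missing days that strictly bracket the event.
find_interval <- function(values, event) {
  none <- c(from = NA_character_, to = NA_character_)
  keep <- !is.na(values)            # drop NA days so that gaps are bridged
  v <- unname(values[keep])
  nm <- names(values)[keep]
  if (is.na(event) || length(v) < 2) return(none)
  lower <- v[-length(v)]            # earlier day of each remaining pair
  upper <- v[-1]                    # later day of each remaining pair
  hit <- which(lower < event & event < upper)
  if (length(hit) == 0) return(none)
  c(from = nm[hit[1]], to = nm[hit[1] + 1])
}

# Apply the helper row by row and bind the results into a two-column matrix.
intervals <- do.call(rbind, lapply(seq_len(nrow(days)), function(i) {
  find_interval(unlist(days[i, day_cols]), days$events[i])
}))

result <- data.frame(Person = days$Person, events = days$events, intervals)
print(result)
```

For the four people in the question this should report day3 to day5 for person 1 (the example mentioned in the question), day2 to day6 for person 2, day5 to day6 for person 3, and day2 to day4 for person 4. An event that falls exactly on a recorded day is not matched by the strict comparison; switch to <= or >= if such events should count.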
Upvotes: 1