Reputation: 145
I'm working with an animal tracking dataset containing locations of birds captured across three different states. For each bird, I'm trying to find the earliest time it crossed a certain latitude, where the latitude threshold varies by state. My data looks something like this:
df <- data.frame(
BirdsID = c("A", "A", "A",
"B", "B", "B",
"C","C", "C"),
State = c("AR", "AR", "AR",
"LA", "LA", "LA",
"TN", "TN", "TN"),
Latitude = c(31, 37, 38,
29, 31, 32,
35, 36, 37),
Time = as.Date(c("2020/04/02", "2020/04/03", "2020/04/04",
"2020/04/03", "2020/04/04", "2020/04/05",
"2020/04/05", "2020/04/06", "2020/04/07")))
What I'd like to do is filter the data by Latitude, conditionally upon state. In this case, I'd like to isolate locations of birds from AR above Latitude 36 degrees, LA with Latitude > 30.6, and TN Latitude > 36.5. After filtering, I'd like to distill the data to the minimum time (i.e., the first occasion they were observed above the specified latitude). Here's an attempt which throws an error:
df2 <- df %>%
if_else(State == "AR", true = filter(Latitude > 36), #If AR, filter >36deg
if_else(State == "LA", true = filter(Latitude > 30.6), #If LA, filter >30.6deg
false = filter(Latitude > 36.5) #else, its TN and filter >36.5
)
) %>%
group_by(BirdsID) %>% #group by bird
filter(Time == min(Time)) #earliest time above filtered Latitude
The error I'm receiving for this example is Error: "condition" must be a logical vector, not a "data.frame" object.
and the error I'm receiving on my actual dataset is Error: "condition" must be a logical vector, not a "tbl_df/tbl/data.frame" object.
Any suggestions or assistance with nested ifelse, if_else, or if() statements would be appreciated.
Upvotes: 1
Views: 471
Reputation: 21
Here is an example of a nested ifelse formula:
dat1 <- dat %>%
mutate (group = ifelse(subject > 2 & subject < 21 | subject == 1 | subject == 22, "patient", "control")) %>%
mutate (condition = ifelse(stim < 9, "pain", ifelse (stim > 8 & stim < 17, "dep", ifelse(stim > 16 & stim < 25, "pos", "neu"))))
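Applied to the data in the question, the same nesting could look something like the sketch below. It uses dplyr's if_else to build a per-row cutoff and then filters on it; the column name `threshold` is made up here for illustration and is not part of the original data.

```r
library(dplyr)

# Same data as in the question, written compactly with rep()
df <- data.frame(
  BirdsID  = rep(c("A", "B", "C"), each = 3),
  State    = rep(c("AR", "LA", "TN"), each = 3),
  Latitude = c(31, 37, 38, 29, 31, 32, 35, 36, 37),
  Time     = as.Date(c("2020/04/02", "2020/04/03", "2020/04/04",
                       "2020/04/03", "2020/04/04", "2020/04/05",
                       "2020/04/05", "2020/04/06", "2020/04/07"))
)

df2 <- df %>%
  # nested if_else returns one cutoff per row ("threshold" is a hypothetical name)
  mutate(threshold = if_else(State == "AR", 36,
                     if_else(State == "LA", 30.6, 36.5))) %>%
  filter(Latitude > threshold) %>%   # keep rows above the state's cutoff
  group_by(BirdsID) %>%
  filter(Time == min(Time)) %>%      # earliest qualifying observation per bird
  ungroup()
```

The key difference from the attempt in the question is that if_else returns a vector of values, so it belongs inside mutate or filter rather than wrapped around calls to filter.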
Upvotes: 0
Reputation: 4487
Here is a way to do what you want with group_map from dplyr:
library(dplyr)
# Define a function to apply custom filter by state
custom_filter <- function(data) {
State <- first(data[["State"]])
if (State == "AR") {
result <- data %>%
filter(Latitude > 36)
} else if (State == "LA") {
result <- data %>%
filter(Latitude > 30.6)
} else {
result <- data %>%
filter(Latitude > 36.5)
}
result
}
df2 <- df %>%
group_by(State) %>%
group_map(.f = ~ custom_filter(.x), .keep = TRUE) %>%
bind_rows() %>%
group_by(BirdsID) %>%
filter(Time == min(Time))
Output
# A tibble: 3 x 4
# Groups: BirdsID [3]
BirdsID State Latitude Time
<chr> <chr> <dbl> <date>
1 A AR 37 2020-04-03
2 B LA 31 2020-04-04
3 C TN 37 2020-04-07
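If the number of states grows, one possible alternative to hard-coding each branch is to keep the cutoffs in a small lookup table and join it onto the data. This is only a sketch: the names `thresholds` and `min_lat` are assumptions introduced here, not something from the question.

```r
library(dplyr)

# Same data as in the question, written compactly with rep()
df <- data.frame(
  BirdsID  = rep(c("A", "B", "C"), each = 3),
  State    = rep(c("AR", "LA", "TN"), each = 3),
  Latitude = c(31, 37, 38, 29, 31, 32, 35, 36, 37),
  Time     = as.Date(c("2020/04/02", "2020/04/03", "2020/04/04",
                       "2020/04/03", "2020/04/04", "2020/04/05",
                       "2020/04/05", "2020/04/06", "2020/04/07"))
)

# Hypothetical lookup table: one latitude cutoff per state
thresholds <- data.frame(
  State   = c("AR", "LA", "TN"),
  min_lat = c(36, 30.6, 36.5)
)

df2 <- df %>%
  inner_join(thresholds, by = "State") %>%  # attach each row's cutoff
  filter(Latitude > min_lat) %>%            # keep rows above the cutoff
  group_by(BirdsID) %>%
  filter(Time == min(Time)) %>%             # earliest qualifying observation
  ungroup() %>%
  select(-min_lat)                          # drop the helper column
```

Note that inner_join drops rows whose State has no entry in the lookup table, which may or may not be what you want for real data.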
Upvotes: 0
Reputation: 33772
You can use dplyr::case_when instead of multiple or nested ifelse.
I would use it to flag the data, then filter on the flag. Something like this:
library(dplyr)
df %>%
mutate(flag = case_when(
State == "AR" & Latitude > 36 ~ "Y",
State == "LA" & Latitude > 30.6 ~ "Y",
State == "TN" & Latitude > 36.5 ~ "Y",
TRUE ~ "N"
)) %>%
filter(flag == "Y") %>%
group_by(State, BirdsID) %>%
filter(Time == min(Time))
Result:
# A tibble: 3 x 5
# Groups: State, BirdsID [3]
BirdsID State Latitude Time flag
<chr> <chr> <dbl> <date> <chr>
1 A AR 37 2020-04-03 Y
2 B LA 31 2020-04-04 Y
3 C TN 37 2020-04-07 Y
Data - please use data.frame!
df <- data.frame(BirdsID = c("A", "A", "A",
"B", "B", "B",
"C","C", "C"),
State = c("AR", "AR", "AR",
"LA", "LA", "LA",
"TN", "TN", "TN"),
Latitude = c(31, 37, 38,
29, 31, 32,
35, 36, 37),
Time = as.Date(c("2020/04/02", "2020/04/03", "2020/04/04",
"2020/04/03", "2020/04/04", "2020/04/05",
"2020/04/05", "2020/04/06", "2020/04/07")))
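If you would rather not carry a flag column around, a variant that should give the same rows is to have case_when return a logical value and use it directly inside filter. Treat this as a sketch rather than a drop-in replacement; the data is repeated only so the snippet runs on its own.

```r
library(dplyr)

# Same data as above, written compactly with rep()
df <- data.frame(
  BirdsID  = rep(c("A", "B", "C"), each = 3),
  State    = rep(c("AR", "LA", "TN"), each = 3),
  Latitude = c(31, 37, 38, 29, 31, 32, 35, 36, 37),
  Time     = as.Date(c("2020/04/02", "2020/04/03", "2020/04/04",
                       "2020/04/03", "2020/04/04", "2020/04/05",
                       "2020/04/05", "2020/04/06", "2020/04/07"))
)

df2 <- df %>%
  # each branch yields TRUE/FALSE, so the result can feed filter() directly
  filter(case_when(
    State == "AR" ~ Latitude > 36,
    State == "LA" ~ Latitude > 30.6,
    State == "TN" ~ Latitude > 36.5,
    TRUE          ~ FALSE            # any other state is dropped
  )) %>%
  group_by(State, BirdsID) %>%
  filter(Time == min(Time)) %>%      # earliest qualifying observation
  ungroup()
```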
Upvotes: 1