Reputation: 154
I have a dataset looking like this:
df <- data.frame(ID=c(1, 1, 1, 2, 3, 3), timeA=c(-10, NA, NA, -15, -10, -5), timeB=c(5, 100, -10, -10, -15, 5), timeC=c(1, 160, 17, -5, -5, 2))
Question 1: I want to create a column giving me the lowest positive value of time for each participant or if all values are negative then keep the negative value in and choose the one that is least negative. Then I want to only choose the lowest positive value for each participant (ID), or when all values are negative, choose the value that is least negative.
Question 2: Is there a function looking for the value that is closest to 0?
So that my output would look like this:
df <- data.frame(ID=c(1,2,3), time_new=c(1, -5, 2))
Upvotes: 0
Views: 259
Reputation: 4487
Simply calculate distance to 0 and then filter
For #1
library(tidyverse)
# function filter check and return a TRUE/FALSE with
# follow logic of #1 - priority positive value first
# if no positive take the maximum negative number
filter_function <- function(x) {
result <- rep(0, length(x))
if (all(x < 0, na.rm = TRUE)) {
reference <- max(x, na.rm = TRUE)
} else {
reference <- min(x[x > 0], na.rm = TRUE)
}
result <- result + (x == reference)
result[is.na(result)] <- 0
as.logical(result)
}
# filter as #1 option
df %>% pivot_longer(!ID,values_to = "value") %>%
# calculate the distance to ZERO for each value
mutate(distance_to_zero = 0 + value,
abs_distance_to_zero = abs(distance_to_zero)) %>%
group_by(ID) %>%
filter(filter_function(distance_to_zero))
#> # A tibble: 3 x 5
#> # Groups: ID [3]
#> ID name value distance_to_zero abs_distance_to_zero
#> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 timeC 1 1 1
#> 2 2 timeC -5 -5 5
#> 3 3 timeC 2 2 2
And this is for #2
# filter as closest to ZERO no matter positive or negative
df %>%
pivot_longer(!ID,values_to = "value") %>%
# calculate the distance to ZERO for each value
mutate(abs_distance_to_zero = abs(0 + value)) %>%
group_by(ID) %>%
# Then filter by the one equal to minimum in each group can return multiple
# records in your actual data
filter(abs_distance_to_zero == min(abs_distance_to_zero, na.rm = TRUE) &
!is.na(abs_distance_to_zero)) %>%
ungroup()
#> # A tibble: 3 x 4
#> ID name value abs_distance_to_zero
#> <dbl> <chr> <dbl> <dbl>
#> 1 1 timeC 1 1
#> 2 2 timeC -5 5
#> 3 3 timeC 2 2
Upvotes: 1
Reputation: 18714
I think your looking for Closest()
from the library DescTools
.
library(tidyverse)
library(DescTools)
# your data
df <- data.frame(ID=c(1, 1, 1, 2, 3, 3),
timeA=c(-10, NA, NA, -15, -10, -5),
timeB=c(5, 100, -10, -10, -15, 5),
timeC=c(1, 160, 17, -5, -5, 2))
# your results
# I stacked the information for easier searching
df %>% pivot_longer(!ID,values_to = "value") %>%
group_by(ID) %>%
summarise(time_new = Closest(value, 0, na.rm = T)) # closest value to zero
Upvotes: 2