Reputation: 1644
I have a data on individuals with their titre levels. The titre levels are in ordered categories. Instead of using the labelled values, I have used the numeric values - this would allow me to see the trend. My logic as follows:
However, I want to relax the criteria to allow for 1 "deviation" or "difference". Example (picture below), id==9
would be considered "fluctuate" in the original logic, but I want this to be "stable" since there's only a single value that's difference from the rest. Similarly for id %in% 10:12
.
This is what I have done previously, using dplyr
and tidyr
, but I am pretty much stuck.
library(tidyverse)
df %>%
# Create a duplicate column to convert into long format
mutate(titre_level_sep = titre_level) %>%
separate_rows(titre_level_sep, sep = ",") %>%
# Calculate for each id
group_by(id) %>%
mutate(
# Trends and diff in titre levels
diff = as.numeric(titre_level_sep) - lag(as.numeric(titre_level_sep)),
increasing = all(diff >= 0, na.rm = TRUE),
decreasing = all(diff <= 0, na.rm = TRUE),
stable = all(diff == 0, na.rm = TRUE)) %>%
ungroup() %>%
select(everything(), -c(diff, titre_level_sep)) %>%
distinct() %>%
mutate(
# Trends final classification
trend2 = case_when(increasing & !decreasing & !stable ~ "uptrend",
!increasing & decreasing & !stable ~ "downtrend",
increasing & decreasing & stable ~ "stable",
!increasing & !decreasing & !stable ~ "fluctuate")) %>%
select(id, trend, trend2)
Here's the data I have used:
df <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
titre_level = c("2,1,1,1,1", "4,4,4,4,4,1,1,1,1,1", "8,6,6,6,6,4",
"1,1,1,1,1,1,1,1,1,3,3", "4,4,7,10", "8,11", "1,1", "6,6,6,6,6,6,6,6,6",
"1,2,1,1,1,1,1,1,1", "8,8,6,8", "4,2,2,2,4,2,2,2,1", "1,1,2,4,4,1"
), trend = c("downtrend", "downtrend", "downtrend", "uptrend",
"uptrend", "uptrend", "stable", "stable", "stable", "stable",
"downtrend", "uptrend")), class = "data.frame", row.names = c(NA,
12L))
Upvotes: 1
Views: 48
Reputation: 12528
df %>%
mutate(trend = map_chr(titre_level, \(s) {
s %>% str_split(",", simplify = TRUE) %>%
{case_when(
all(diff(as.numeric(.)) == 0, na.rm = TRUE) ~ "stable",
all(diff(as.numeric(.)) >= 0, na.rm = TRUE) ~ "uptrend",
all(diff(as.numeric(.)) <= 0, na.rm = TRUE) ~ "downtrend",
length(table(.)) == 2 & min(table(.)) == 1 ~ "deviation", # if there are two unique values in the titre level, and one of them appears only once, then it's a "deviation"
TRUE ~ "flutuate")}
}))
# A tibble: 12 × 3
id titre_level trend
<dbl> <chr> <chr>
1 1 2,1,1,1,1 downtrend
2 2 4,4,4,4,4,1,1,1,1,1 downtrend
3 3 8,6,6,6,6,4 downtrend
4 4 1,1,1,1,1,1,1,1,1,3,3 uptrend
5 5 4,4,7,10 uptrend
6 6 8,11 uptrend
7 7 1,1 stable
8 8 6,6,6,6,6,6,6,6,6 stable
9 9 1,2,1,1,1,1,1,1,1 deviation
10 10 8,8,6,8 deviation
11 11 4,2,2,2,4,2,2,2,1 flutuate
12 12 1,1,2,4,4,1 flutuate
Upvotes: 1