Reputation: 29
I hope someone can help me with the following problem: I would like to display the value (avg) of different laboratory parameters (parameter) of 2 different groups (gruppe). Additionally, I want to plot this information according to the change over time (performance) in 3 different facets. Here a tibble of the dataset:
# A tibble: 402 x 4
# Groups: gruppe, parameter [134]
gruppe parameter performance avg
<chr> <chr> <chr> <dbl>
1 DGE ACPA(citrull. Prot.-Ak) EIA/Se change_t1t0 NaN
2 DGE ACPA(citrull. Prot.-Ak) EIA/Se change_t2t0 37.6
3 DGE ACPA(citrull. Prot.-Ak) EIA/Se change_t3t0 NaN
4 Fasten Apolipoprot. A1 HP change_t1t0 41.2
5 DGE Apolipoprot. A1 HP change_t2t0 NaN
6 DGE Apolipoprot. A1 HP change_t3t0 NaN
7 DGE Apolipoprotein B change_t1t0 NaN
8 DGE Apolipoprotein B change_t2t0 NaN
9 Fasten Apolipoprotein B change_t3t0 NaN
10 DGE aPTT Pathromtin SL change_t1t0 0.571
# … with 392 more rows
This worked totally fine using this code:
#Create labels for 3 facets
lab_labels <- c("Change from Baseline to Day 7 [%]",
"Change from Baseline to Week 6 [%]",
"Change from Baseline to Week 12 [%]")
names(lab_labels) <- c("change_t1t0",
"change_t2t0",
"change_t3t0")
labor_summ_long %>%
filter(parameter %in% c("Hämatokrit (l/l)","Hämoglobin", "Leukozyten","MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten","MPV")) %>%
arrange(desc(avg))%>%
group_by(gruppe, performance)%>%
ggplot(aes(x=reorder(parameter,avg), y=avg, group=gruppe, fill = gruppe))+
geom_col(position = position_dodge())+
facet_wrap(~performance,
scales ="free_y",
dir="v",
labeller = labeller(performance = lab_labels))+
ylab("") +
xlab("") +
labs(color="", linetype="")+
theme_pubclean()+
theme(strip.background=element_rect(fill="lightgrey"),
strip.text = element_text(face="bold"),
legend.position = "bottom",
legend.title=element_blank())+
theme(axis.text.x = element_text(angle=45, hjust=1, vjust = 1))+
scale_x_discrete(labels = c("Hämoglobin"="Hemoglobin", "Leukozyten" = "Leucocytes",
"MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten"="Thrombocytes",
"MPV", "Hämatokrit (l/l)"="Hematocrite"))+
scale_fill_discrete(labels=c('DGE', "Fasten"='Fasting'))
This is how the plot looks like
What I am missing and am failing to find the solution to: I would like to order the bars...
I tricked around with arrange, sort, etc. but couldn't get all the conditions above together.
Do you have any ideas? Thanks a lot in advance!
Upvotes: 1
Views: 120
Reputation: 124393
The issue is that reorder
reorders by taking the mean of all values for each parameter
without taking account of any grouping.
Adapting this answer to your case and making use of some random example data to mimic your real data this could be achieved like so:
The helper function reorder_where
allows to order the categories by an additional condition, e.g. in your case where gruppe == "Fasten" & performance == "change_t1t0"
is TRUE
library(dplyr)
library(ggplot2)
reorder_where <- function (x, by, where, fun = mean, ...) {
xx <- x[where]
byby <- by[where]
byby <- tapply(byby, xx, FUN = fun, ...)[x]
reorder(x, byby)
}
labor_summ_long %>%
filter(parameter %in% c("Hämatokrit (l/l)","Hämoglobin", "Leukozyten","MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten","MPV")) %>%
ggplot(aes(x=reorder_where(parameter, -avg, gruppe == "Fasten" & performance == "change_t1t0"), y=avg, group=gruppe, fill = gruppe))+
geom_col(position = position_dodge())+
facet_wrap(~performance,
scales ="free_y",
dir="v",
labeller = labeller(performance = lab_labels))+
ylab("") +
xlab("") +
labs(color="", linetype="")+
#theme_pubclean()+
theme(strip.background=element_rect(fill="lightgrey"),
strip.text = element_text(face="bold"),
legend.position = "bottom",
legend.title=element_blank())+
theme(axis.text.x = element_text(angle=45, hjust=1, vjust = 1))+
scale_x_discrete(labels = c("Hämoglobin"="Hemoglobin", "Leukozyten" = "Leucocytes",
"MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten"="Thrombocytes",
"MPV", "Hämatokrit (l/l)"="Hematocrite"))+
scale_fill_discrete(labels=c('DGE', "Fasten"='Fasting'))
DATA
set.seed(42)
labor_summ_long <- data.frame(
parameter = sample(c("Hämatokrit (l/l)","Hämoglobin", "Leukozyten","MCV", "MCH", "MCHC", "RDW-CV", "Thromobzyten","MPV"), 100, replace = TRUE),
gruppe = sample(c("DGE", "Fasten"), 100, replace = TRUE),
performance = sample(c("change_t1t0",
"change_t2t0",
"change_t3t0"), 100, replace = TRUE),
avg = runif(100, 0, 50)
)
labor_summ_long <- dplyr::distinct(labor_summ_long, parameter, gruppe, performance, .keep_all = TRUE)
Upvotes: 1