Reputation: 35
I have the following pre-summarized cost data:
MeanCost | Std | MedianCost | LowerIQR | UpperIQR | StatusGroup | AgeGroup |
---|---|---|---|---|---|---|
700 | 500 | 650 | 510 | 780 | Dead | Young |
800 | 600 | 810 | 666 | 1000 | Alive | Young |
500 | 200 | 657 | 450 | 890 | Comatose | Young |
300 | 400 | 560 | 467 | 670 | Dead | Old |
570 | 600 | 500 | 450 | 600 | Alive | Old |
555 | 500 | 677 | 475 | 780 | Comatose | Old |
333 | 455 | 300 | 200 | 400 | Dead | Middle |
678 | 256 | 600 | 445 | 787 | Alive | Middle |
1500 | 877 | 980 | 870 | 1200 | Comatose | Middle |
I wish to create a boxplot with this information - similar to the one below.
Each color represents a status group (blue = dead, red = alive, green = comatose).
Each cluster of boxes represents an age group (left cluster = young, middle cluster = middle, right cluster = old).
I know that I don't have min and max, so whiskers are not necessary.
I want to code this in R, and any help would be appreciated! Thank you.
Here is the code I have tried:
dattest <- data.frame(
  Mean_Cost = c(700,800,500,300,570,555,333,678,1500),
  Std = c(500,600,200,400,600,500,455,256,877),
  Median_Cost = c(650,810,657,560,500,677,300,600,980),
  LowerIQR = c(510,666,450,467,450,475,200,445,870),
  UpperIQR = c(780,1000,890,670,600,780,400,787,1200),
  StatusGroup = c(1,2,3,1,2,3,1,2,3),
  AgeGroup = c(1,1,1,2,2,2,3,3,3))
where for StatusGroup 1 = dead, 2 = alive, 3 = comatose,
and for AgeGroup 1 = young, 2 = old, 3 = middle.
ggplot(dattest, aes(xmin = AgeGroup-.25, xmax=AgeGroup+.25, ymin=LowerIQR, ymax=UpperIQR)) +
  geom_rect(fill="transparent", col = "blue") +
  geom_segment(aes(y=Median_Cost, yend=Median_Cost, x=AgeGroup-.25, xend=AgeGroup+.25), col="blue") +
  geom_point(mapping=aes(x = StatusGroup, y = Mean_Cost), col="red") +
  scale_x_continuous(breaks=1:3, labels=c("Young","Old","Middle")) +
  theme_classic()
This code is definitely not giving me what I want.
Upvotes: 1
Views: 1102
Reputation: 81
Is this what you are trying to do?
library(tidyverse)
df <- tibble::tribble(
  ~MeanCost, ~Std, ~MedianCost, ~LowerIQR, ~UpperIQR, ~StatusGroup, ~AgeGroup,
  700L, 500L, 650L, 510L, 780L, "Dead", "Young",
  800L, 600L, 810L, 666L, 1000L, "Alive", "Young",
  500L, 200L, 657L, 450L, 890L, "Comatose", "Young",
  300L, 400L, 560L, 467L, 670L, "Dead", "Old",
  570L, 600L, 500L, 450L, 600L, "Alive", "Old",
  555L, 500L, 677L, 475L, 780L, "Comatose", "Old",
  333L, 455L, 300L, 200L, 400L, "Dead", "Middle",
  678L, 256L, 600L, 445L, 787L, "Alive", "Middle",
  1500L, 877L, 980L, 870L, 1200L, "Comatose", "Middle"
)
df %>%
  mutate(AgeGroup = factor(AgeGroup, levels = c("Young", "Middle", "Old"))) %>%
  ggplot(aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(aes(
    lower = LowerIQR,
    upper = UpperIQR,
    middle = MedianCost,
    ymin = MedianCost - Std,
    ymax = MedianCost + Std),
    stat = "identity", width = 0.5)
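The whiskers above extend to MedianCost ± Std, which is one way to fill the `ymin`/`ymax` aesthetics that `geom_boxplot(stat = "identity")` requires. Since the question says whiskers are not needed, a possible variant is to set `ymin` and `ymax` to the box edges so the whiskers have zero length. The sketch below assumes that approach and also assumes the blue/red/green mapping described in the question; the data frame is re-entered with base R so the block runs on its own, and the axis label "Cost" is an assumption.

```r
library(ggplot2)

# Same summary values as in the question, entered as a plain data frame.
df <- data.frame(
  MeanCost    = c(700, 800, 500, 300, 570, 555, 333, 678, 1500),
  MedianCost  = c(650, 810, 657, 560, 500, 677, 300, 600, 980),
  LowerIQR    = c(510, 666, 450, 467, 450, 475, 200, 445, 870),
  UpperIQR    = c(780, 1000, 890, 670, 600, 780, 400, 787, 1200),
  StatusGroup = rep(c("Dead", "Alive", "Comatose"), times = 3),
  AgeGroup    = rep(c("Young", "Old", "Middle"), each = 3)
)

# Order the clusters left to right as Young, Middle, Old.
df$AgeGroup <- factor(df$AgeGroup, levels = c("Young", "Middle", "Old"))

p <- ggplot(df, aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(
    aes(lower = LowerIQR, upper = UpperIQR, middle = MedianCost,
        # Whiskers collapse onto the box edges, so none are visible.
        ymin = LowerIQR, ymax = UpperIQR),
    stat = "identity", width = 0.5
  ) +
  # Colors requested in the question (assumed mapping by status name).
  scale_fill_manual(values = c(Dead = "blue", Alive = "red", Comatose = "green")) +
  labs(y = "Cost")

print(p)
```

Naming the elements of `values` ties each color to a status label, so the mapping does not depend on the alphabetical order of the groups.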
To mark the mean with an "x", add geom_point() and dodge the points by the same width as the boxes so each one lines up with its box:
df %>%
  mutate(AgeGroup = factor(AgeGroup, levels = c("Young", "Middle", "Old"))) %>%
  ggplot(aes(x = AgeGroup, fill = StatusGroup)) +
  geom_boxplot(aes(
    lower = LowerIQR,
    upper = UpperIQR,
    middle = MedianCost,
    ymin = MedianCost - Std,
    ymax = MedianCost + Std),
    stat = "identity", width = 0.5) +
  geom_point(aes(y = MeanCost),
    position = position_dodge(width = 0.5),
    shape = 4)
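If you would rather keep the numerically coded `dattest` from the question, the codes need to become labeled factors before they can be used as a discrete x axis and fill. A minimal base R sketch, assuming the coding stated in the question (StatusGroup 1 = dead, 2 = alive, 3 = comatose; AgeGroup 1 = young, 2 = old, 3 = middle):

```r
# Data frame from the question, with groups stored as numeric codes.
dattest <- data.frame(
  Mean_Cost   = c(700, 800, 500, 300, 570, 555, 333, 678, 1500),
  Std         = c(500, 600, 200, 400, 600, 500, 455, 256, 877),
  Median_Cost = c(650, 810, 657, 560, 500, 677, 300, 600, 980),
  LowerIQR    = c(510, 666, 450, 467, 450, 475, 200, 445, 870),
  UpperIQR    = c(780, 1000, 890, 670, 600, 780, 400, 787, 1200),
  StatusGroup = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  AgeGroup    = c(1, 1, 1, 2, 2, 2, 3, 3, 3)
)

# Map status codes to their labels.
dattest$StatusGroup <- factor(dattest$StatusGroup,
                              levels = 1:3,
                              labels = c("Dead", "Alive", "Comatose"))

# List the age codes in display order (1 = young, 3 = middle, 2 = old)
# so the factor levels come out as Young, Middle, Old.
dattest$AgeGroup <- factor(dattest$AgeGroup,
                           levels = c(1, 3, 2),
                           labels = c("Young", "Middle", "Old"))

str(dattest[, c("StatusGroup", "AgeGroup")])
```

The recoded data frame can then go into the same ggplot() call, using its own column names (Median_Cost and Mean_Cost instead of MedianCost and MeanCost).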
Upvotes: 3