XFrost

Reputation: 119

How to arrange subgroups in a grouped barplot in R in ascending order

I have a dataset from which I am trying to create a grouped bar chart. The groups are the "before" and "after" timepoints, and the subgroups are 6 different devices. The "before" group has only 3 of the 6 subgroups, while the "after" group has all 6. I am having two main issues: getting the code to arrange the subgroups in ascending order, and getting the "Before" group to appear to the left of the "After" group in the plot.

Below is the code that I have for dataset FDA_co_tier:

library(tidyverse)
library(ggplot2)

F_dev_dec <- FDA_co_tier%>%
  select(device_type, year_approved) %>% 
  mutate(year_2015 = if_else(year_approved >= 2015, "after", "before")) %>% 
  mutate(year_2015 = factor(year_2015, levels=c("before", "after"))) %>% 
  group_by(device_type, year_2015) %>% 
  summarise(N=n()) %>% 
  mutate(N= factor(N, levels = N))

which gives me a table of:

`summarise()` has grouped output by 'device_type'. You can override using the `.groups` argument.
# A tibble: 9 x 3
# Groups:   device_type [6]
  device_type         year_2015 N    
  <chr>               <fct>     <fct>
1 Accessories         after     6    
2 Aspiration_catheter before    4    
3 Aspiration_catheter after     32   
4 Guidewire           before    3    
5 Guidewire           after     23   
6 Microcatheter       after     7    
7 Sheath              after     19   
8 Stentretriever      before    17   
9 Stentretriever      after     22 

I converted both the year_2015 variable and the N variable to factors, and tried to set the year_2015 levels to "before" then "after".

As mentioned above, I want to plot the different devices as bars for the "Before" and "After" timepoints, in ascending order of N. To do this, I initially tried:

F_dev_dec %>% 
  ggplot(aes(x= year_2015, fill= device_type)) +
  geom_bar(position = position_dodge()) +
      labs(title = NULL,
           x= NULL,
           y= "Count (n)")

which gave me this graph: https://i.sstatic.net/NAM8i.png

I'm not sure exactly why this has the correct groups ("Before", "After") but all the device subgroups go to 1, as if it's calculating some odd percentage.

I next tried to adjust the code and not summarize and not double group the data table by device and year_2015:

F_dev_dec <- FDA_co_tier%>%
  select(device_type, year_approved) %>% 
  mutate(year_2015 = if_else(year_approved >= 2015, "after", "before")) %>% 
  mutate(year_2015 = factor(year_2015, levels=c("before", "after"))) %>% 
  group_by(device_type, year_2015)  

Then, using the same ggplot code as above, I get something a little closer, but the subgroups are not in ascending order of N: https://i.sstatic.net/wSm5L.png

After more trial and error, the only way I've been able to get the plot to look the way I want is with this code:

F_dev_dec <- FDA_co_tier%>%
  select(device_type, year_approved) %>% 
  mutate(year_2015 = if_else(year_approved >= 2015, "before", "after")) %>% 
  mutate(year_2015 = factor(year_2015, levels=c("before", "after"))) %>% 
  group_by(device_type, year_2015) %>% 
  summarise(N=n())

`summarise()` has grouped output by 'device_type'. You can override using the `.groups` argument.
# A tibble: 9 x 3
# Groups:   device_type [6]
  device_type         year_2015     N
  <chr>               <fct>     <int>
1 Accessories         after         6
2 Aspiration_catheter before        4
3 Aspiration_catheter after        32
4 Guidewire           before        3
5 Guidewire           after        23
6 Microcatheter       after         7
7 Sheath              after        19
8 Stentretriever      before       17
9 Stentretriever      after        22

F_dev_dec %>% 
  ggplot(aes(x= year_2015, y= N, fill= reorder(device_type, N))) +
  geom_bar(data=F_dev_dec %>% filter(year_2015 == "before"), width=0.9, position = position_dodge(),  stat = "identity") +
  geom_bar(data=F_dev_dec %>% filter(year_2015 == "after"), width=0.5, position = position_dodge(),  stat = "identity") +
  scale_x_discrete(labels = c("Before", "After")) +
  theme_classic()+
  labs(title = NULL,
       x= NULL,
       y= "Count (n)")

which gives the plot I'm looking for: https://i.sstatic.net/KHfzk.png

Of note: in the code immediately above, the mutate() call on FDA_co_tier that generates the variable year_2015 keeps messing up the final plot. When I attempt to do the mutation correctly with mutate(year_2015 = if_else(year_approved >= 2015, "after", "before")), the final plot looks like this: https://i.sstatic.net/G4vyU.png

The data table gets reversed (values that should be "before" 2015 are labeled as "after", and values that should be "after" are labeled as "before"). It's more annoying than anything, as the groups all still contain the same values but the labels are wrong, so I had to add the scale_x_discrete(labels = c("Before", "After")) line to correct the plot.

Any advice on how to get the subgroups in correct ascending order, and on what's going on with the code that generates the plot, would be much appreciated!

Upvotes: 1

Views: 1100

Answers (1)

Marek Fiołka

Reputation: 4949

You need to set the appropriate levels for the device_type variable. The easiest way to do that is shown below.

library(tidyverse)

F_dev_dec = read.table(
  header = TRUE,text="
device_type         year_2015 N    
 Accessories         after     6    
 Aspiration_catheter before    4    
 Aspiration_catheter after     32   
 Guidewire           before    3    
 Guidewire           after     23   
 Microcatheter       after     7    
 Sheath              after     19   
 Stentretriever      before    17   
 Stentretriever      after     22 
") %>% as_tibble()

F_dev_dec = F_dev_dec %>% 
  group_by(year_2015) %>% 
  arrange(N) %>% 
  mutate(
  year_2015 = year_2015 %>% factor(c("before", "after")),
  device_type = device_type %>% fct_inorder()
)

F_dev_dec %>% 
  ggplot(aes(x= year_2015, y= N, fill= reorder(device_type, N))) +
  geom_bar(data=F_dev_dec %>% filter(year_2015 == "before"), width=0.9, position = position_dodge(),  stat = "identity") +
  geom_bar(data=F_dev_dec %>% filter(year_2015 == "after"), width=0.5, position = position_dodge(),  stat = "identity") +
  scale_x_discrete(labels = c("Before", "After")) +
  theme_classic()+
  labs(title = NULL,
       x= NULL,
       y= "Count (n)")


Note the order of the commands: first arrange(N), and only then device_type = device_type %>% fct_inorder().
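The reason the order matters can be seen on a small made-up table. This is only a sketch: the tibble toy and its values are hypothetical, not taken from the question. fct_inorder() sets the factor levels in order of first appearance, so sorting by N first gives levels in ascending N.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data, for illustration only
toy <- tibble(
  device_type = c("Sheath", "Guidewire", "Accessories"),
  N           = c(19, 3, 6)
)

# Without sorting, the levels simply follow the original row order
unsorted_levels <- levels(fct_inorder(toy$device_type))

# Sort by N first, then freeze that row order as the factor levels
toy_sorted <- toy %>%
  arrange(N) %>%
  mutate(device_type = fct_inorder(device_type))

sorted_levels <- levels(toy_sorted$device_type)
sorted_levels
```

Here sorted_levels starts with the device that has the smallest N, whereas unsorted_levels just repeats the order in which the rows were typed.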

P.S. It would have helped if you had shared your FDA_co_tier data table with us. Without it, I had to read in the summary using read.table.

Update 1

Phew! It took some effort to get the effect you expect, but in the end it worked. Let's see what we have here. First, I will work with your complete (unsummarised) data, which I reconstructed myself from the counts in your table.

library(tidyverse)

FDA_co_tier = tibble(
  device_type = c(rep("Accessories", 6),
                  rep("Aspiration_catheter", 36),
                  rep("Guidewire", 26),
                  rep("Microcatheter", 7),
                  rep("Sheath", 19),
                  rep("Stentretriever", 39)),
  year_2015 = c(rep("after", 6),
                rep("before", 4),
                rep("after", 32),
                rep("before", 3),
                rep("after", 49),
                rep("before", 17),
                rep("after", 22)))

output

# A tibble: 133 x 2
   device_type         year_2015
   <chr>               <chr>    
 1 Accessories         after    
 2 Accessories         after    
 3 Accessories         after    
 4 Accessories         after    
 5 Accessories         after    
 6 Accessories         after    
 7 Aspiration_catheter before   
 8 Aspiration_catheter before   
 9 Aspiration_catheter before   
10 Aspiration_catheter before   
# ... with 123 more rows
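
As a sanity check that this reconstruction matches the summary table in the question, the rows can be counted per device and timepoint. This is only a sketch: it rebuilds the same 133 rows more compactly with rep(times = ...), and the name counts is my own.

```r
library(dplyr)

# Same rows as the tibble above, built with one rep() call per column
FDA_co_tier <- tibble(
  device_type = rep(
    c("Accessories", "Aspiration_catheter", "Guidewire",
      "Microcatheter", "Sheath", "Stentretriever"),
    times = c(6, 36, 26, 7, 19, 39)),
  year_2015 = rep(
    c("after", "before", "after", "before", "after", "before", "after"),
    times = c(6, 4, 32, 3, 49, 17, 22))
)

# One row per device/timepoint combination, with its count in column n
counts <- FDA_co_tier %>% count(device_type, year_2015)
counts
```

The counts table should have the same nine device/timepoint combinations and the same totals as the summarised F_dev_dec in the question.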

Now let's create a plot. Note one clever line of code: geom_bar(position = position_dodge(), alpha = 0). It does not display anything; it only sets the expected order of the variable year_2015 on the x axis.

FDA_co_tier %>% 
  mutate(
    year_2015 = year_2015 %>% factor(c("before", "after")))  %>% 
  ggplot(aes(year_2015, fill=device_type))+
  geom_bar(position = position_dodge(), alpha=0)+
  geom_bar(data = . %>% 
             filter(year_2015 == "before") %>% 
             mutate(device_type = device_type %>% fct_infreq() %>% fct_rev()), 
           position = position_dodge())+
  geom_bar(data = . %>% 
             filter(year_2015 == "after") %>% 
             mutate(device_type = device_type %>% fct_infreq() %>% fct_rev()),
           position = position_dodge())
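
To see what the fct_infreq() %>% fct_rev() step does inside each layer, here is a small sketch on made-up labels. The vector devices is hypothetical and stands in for the device_type column of a single timepoint.

```r
library(forcats)

# Hypothetical device labels for one timepoint, illustration only
devices <- factor(c(rep("Stentretriever", 5),
                    rep("Guidewire", 1),
                    rep("Sheath", 3)))

# fct_infreq(): levels ordered from most to least frequent
desc_levels <- levels(fct_infreq(devices))

# fct_rev() flips that order, so the least frequent level comes first
asc_levels <- levels(fct_rev(fct_infreq(devices)))
asc_levels
```

As far as I can tell, the dodged bars follow the order of the fill factor's levels, so this gives ascending counts. Because each geom_bar() layer filters to one timepoint before reordering, the order is worked out separately for "before" and "after".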


Upvotes: 2
