wantingtoimprove

Reputation: 27

Grouped bar chart in R formatting

I have this data.table:

     dataset            variable       ARI
 1:    pcaZ0     pearson, single 0.6984690
 2:    pcaZ0   pearson, complete 0.6984690
 3:    pcaZ0    pearson, average 0.6984690
 4:    pcaZ0    spearman, single 0.6984690
 5:    pcaZ0  spearman, complete 0.6984690
 6:    pcaZ0   spearman, average 0.6984690
 7:    pcaZ0      cosine, single 0.7238611
 8:    pcaZ0    cosine, complete 0.7238611
 9:    pcaZ0     cosine, average 0.7109783
10:    pcaZ0   euclidean, single 0.5177371
11:    pcaZ0 euclidean, complete 0.5177371
12:    pcaZ0  euclidean, average 0.5177371
13:    pcaZ1     pearson, single 0.5429425
14:    pcaZ1   pearson, complete 0.9619119
15:    pcaZ1    pearson, average 0.5429425
16:    pcaZ1    spearman, single 0.5317401
17:    pcaZ1  spearman, complete 0.5317401
18:    pcaZ1   spearman, average 0.7371173
19:    pcaZ1      cosine, single 0.5220314
20:    pcaZ1    cosine, complete 0.9434279
21:    pcaZ1     cosine, average 0.8089993
22:    pcaZ1   euclidean, single 0.5177371
23:    pcaZ1 euclidean, complete 0.5177371
24:    pcaZ1  euclidean, average 0.5177371
25: modpcaZ0     pearson, single 0.5251420
26: modpcaZ0   pearson, complete 0.8485167
27: modpcaZ0    pearson, average 0.9596045
28: modpcaZ0    spearman, single 0.5251420
29: modpcaZ0  spearman, complete 0.8485167
30: modpcaZ0   spearman, average 0.8838628
31: modpcaZ0      cosine, single 0.5083105
32: modpcaZ0    cosine, complete 0.9596045
33: modpcaZ0     cosine, average 0.9596045
34: modpcaZ0   euclidean, single 0.5203030
35: modpcaZ0 euclidean, complete 0.6717862
36: modpcaZ0  euclidean, average 0.5360825
37: modpcaZ1     pearson, single 0.5360825
38: modpcaZ1   pearson, complete 0.8485167
39: modpcaZ1    pearson, average 0.8838628
40: modpcaZ1    spearman, single 0.5630128
41: modpcaZ1  spearman, complete 0.8485167
42: modpcaZ1   spearman, average 0.8485167
43: modpcaZ1      cosine, single 0.5360825
44: modpcaZ1    cosine, complete 0.8314749
45: modpcaZ1     cosine, average 0.9400379
46: modpcaZ1   euclidean, single 0.5360825
47: modpcaZ1 euclidean, complete 0.7239638
48: modpcaZ1  euclidean, average 0.5487061

Using the following code,

library(tidyverse)
library(ggplot2)
library(gridExtra)
library(ggtext)
library(khroma)

# grouped bar charts
ggplot(b2, aes(y=variable, x=ARI, fill=dataset)) + 
geom_col(position=position_dodge(), width=0.6) +
scale_fill_manual(name=NULL,
                  breaks=c("pcaZ0", "pcaZ1", "modpcaZ0", "modpcaZ1"),
                  labels=c("non-standard.", 
                           "standard.", 
                           "non-standard.<br>*without 71-76*",
                           "standard.<br>*without 71-76*"),
                  values=c(mypal2)) +
scale_x_continuous(expand=c(0, 0)) +
labs(y=NULL,
     x="adjusted Rand index") +
theme_classic() +
theme(axis.text.x=element_markdown(),
      legend.text=element_markdown(),
      legend.position=c(0.9, 0.9),
      panel.grid.major.x=element_line(color="lightgray", size=0.25))

I have produced a grouped bar chart.

I would like to group the variables by linkage method. Also, instead of showing "pearson, single" on a single line, I would like the proximity metric (pearson) on one line and the linkage method (single), italicized, on the line below it. Can I do this by changing only the graph code, or must I change my data.table? If the latter, how can I change my data so that the graph is formatted this way?

Thank you for your help.

Edit 1: Here is the code for mypal2:

mypal <- colour("okabeito")(8)
mypal <- mypal[c(2:8, 1)]
names(mypal) <- NULL

mypal2 <- mypal[-c(2, 4, 6)]

palette(mypal2)

Edit 2: Added the libraries used.

Upvotes: 0

Views: 35

Answers (1)

TarJae

Reputation: 78917

With some preprocessing of the data, using separate() from the tidyr package in combination with some basic markdown syntax (rendered by element_markdown() from ggtext), we could do:

library(tidyverse)
library(ggtext)
library(khroma)

# b2 is the data.table from the question; mypal2 is defined in its Edit 1
b2 %>%
  separate(variable, c("proximity_metric", "linkage_method"), sep = ", ") %>%
  mutate(my_ylabel = paste0(proximity_metric, "<br>", "*", linkage_method, "*")) %>%
  ggplot(aes(y=my_ylabel, x=ARI, fill=dataset)) + 
  geom_col(position=position_dodge(), width=0.6) +
  scale_fill_manual(name=NULL,
                    breaks=c("pcaZ0", "pcaZ1", "modpcaZ0", "modpcaZ1"),
                    labels=c("non-standard.", 
                             "standard.", 
                             "non-standard.<br>*without 71-76*",
                             "standard.<br>*without 71-76*"),
                    values=c(mypal2)) +
  scale_x_continuous(expand=c(0, 0)) +
  labs(y=NULL,
       x="adjusted Rand index") +
  theme_classic() +
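  # render <br> and *...* as markdown on the y axis and in the legend labels
  theme(axis.text.y = element_markdown(),
        legend.text = element_markdown())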
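
If changing the data is not desirable, the same two-line labels can probably be produced in the graph code alone. The sketch below assumes that scale_y_discrete() is given a labelling function and that ggtext's element_markdown() renders the result. The helper to_md_label and the toy data frame toy are hypothetical names used only for illustration; the two ARI values are copied from rows 1 and 2 of the question's table.

```r
library(ggplot2)
library(ggtext)

# Hypothetical helper: "pearson, single" -> "pearson<br>*single*"
to_md_label <- function(x) sub("^(.*), (.*)$", "\\1<br>*\\2*", x)

# Toy stand-in for b2 (two rows copied from the question's table)
toy <- data.frame(
  dataset  = "pcaZ0",
  variable = c("pearson, single", "pearson, complete"),
  ARI      = c(0.6984690, 0.6984690)
)

p <- ggplot(toy, aes(y = variable, x = ARI, fill = dataset)) +
  geom_col(position = position_dodge(), width = 0.6) +
  scale_y_discrete(labels = to_md_label) +  # relabel at draw time; data untouched
  theme_classic() +
  theme(axis.text.y = element_markdown())   # render <br> and *...* as markdown
```

With this variant the variable column keeps its original values, so any code that filters or joins on it is unaffected.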

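
The question also asks for the bars to be grouped by linkage method. On a discrete axis ggplot2 orders categories by factor level, so one way to get that grouping is to turn my_ylabel into a factor whose levels are sorted by linkage method first. This is a minimal sketch on a four-row subset of the question's table; the names b2_small, b2_labelled and linkage_order, and the order single, complete, average, are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Four rows copied from the question's table (rows 1, 2, 7, 8)
b2_small <- data.frame(
  dataset  = "pcaZ0",
  variable = c("pearson, single", "pearson, complete",
               "cosine, single", "cosine, complete"),
  ARI      = c(0.6984690, 0.6984690, 0.7238611, 0.7238611)
)

# Assumed display order of the linkage methods
linkage_order <- c("single", "complete", "average")

b2_labelled <- b2_small %>%
  separate(variable, c("proximity_metric", "linkage_method"), sep = ", ") %>%
  mutate(
    linkage_method = factor(linkage_method, levels = linkage_order),
    my_ylabel      = paste0(proximity_metric, "<br>*", linkage_method, "*")
  ) %>%
  # sort by linkage method first, then by metric within each method
  arrange(linkage_method, proximity_metric) %>%
  # factor levels follow row order, so labels sharing a linkage method are adjacent
  mutate(my_ylabel = factor(my_ylabel, levels = unique(my_ylabel)))

levels(b2_labelled$my_ylabel)
```

Passing b2_labelled to the ggplot() call above in place of the separate()/mutate() steps should then keep bars with the same linkage method next to each other. Note that a discrete y axis places the first level at the bottom, so wrapping the levels in rev() may be needed if the first linkage method should appear at the top.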

Upvotes: 1
