IzzyBizzy

Reputation: 43

Add subgroup labels/order elements on x-axis in ggplot2 r

I am trying to add sub-group labels and order the observations on the x-axis of my ggplot2 plot. There are several questions about this on here already, but the answers all recommend faceting. My plot is already faceted, so those answers don't work for me. I tried reorder(x, by_this_variable), but this only seems to work if by_this_variable is the variable on the y-axis. Why? If I try to reorder by a different variable, I receive a warning:

argument is not numeric or boolean

To be more specific, I am plotting two points (percentages per participant, obtained in two different tasks) for each discrete x-axis value (one per participant), with an arrow connecting the two points of each participant. This indicates whether participant behavior changed negatively or positively between tasks. My facets are two different (treatment) conditions that participants were randomly assigned to. I would now like to group these dot-arrow pairs by participant origin (a possible predictor of different responses to the treatment) and add this information as a label on the x-axis, but all I can achieve right now is to have the values sorted alphabetically (the default).

This plot might end up looking too busy. If there is a better way to plot all of this information (relative change of behavior by task, by participant, by condition, by origin) in one graph, I am open to suggestions!

My code:

Data <- data.frame(c(28.5, 20, 55.4, 30.5, 66.6, 45.4, 43.2, 43.1, 28.5, 55.4, 30.5, 
                   66.6, 45.4, 20), c("Participant 1", "Participant 1", 
                   "Participant 2", "Participant 2", "Participant 3", 
                   "Participant 3","Participant 4", "Participant 4","Participant 5", 
                   "Participant 5", "Participant 6", "Participant 6", "Participant 7", 
                   "Participant 7"),c("India", "India", "India", "India", "Algeria", 
                   "Algeria", "Algeria", "Algeria", "India", "India", "India", 
                   "India", "Algeria", "Algeria"),c("Treatment A", "Treatment A", 
                   "Treatment B", "Treatment B","Treatment A", "Treatment A", 
                   "Treatment B", "Treatment B", "Treatment A", "Treatment A", 
                   "Treatment B", "Treatment B", "Treatment A", "Treatment A"),
                   c("Task 1", "Task 2", "Task 1", "Task 2", "Task 1", "Task 2", 
                   "Task 1", "Task 2", "Task 1", "Task 2", "Task 1", "Task 2", 
                   "Task 1", "Task 2"))
colnames(Data) <- c("Percentage", "Participant", "Origin", "Treatment", "Task")

library(ggplot2)

ggplot(Data, aes(y=Percentage, x = Participant, group = Participant))+
   geom_point(aes(color = Task))+ 
   geom_line(arrow = arrow(length=unit(0.30,"cm"), type = "closed"), size = .3)+
   facet_grid(~Treatment, scales = "free_x", space = "free_x")+ 
   theme(axis.text.x = element_text(angle = 90, hjust = 1))

This produces the following plot:

Plot

In the Treatment A facet, Participants 1 & 5 are from India and Participants 3 & 7 are from Algeria, so I would like to group them together on the x-axis and add a label for origin.

EDIT:

The warning above seems to stem from the fact that Origin is a multi-level factor, while reorder appears to work only with numeric values. Setting x = reorder(Participant, as.numeric(Origin)) therefore orders the values according to Origin. But how can I add the appropriate Origin labels below the plot?
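
A likely explanation for the warning, sketched in base R: by default reorder() summarises the ordering variable with mean() for each level of x, and mean() returns NA with a warning when given a factor or character vector. Converting Origin to integer codes first avoids this. The vectors below are illustrative stand-ins for the question's columns (one entry per participant), and as.integer(factor(...)) is used on the assumption that Origin may be either character or factor, depending on the R version.

```r
# Illustrative stand-ins for the 'Participant' and 'Origin' columns.
participant <- c("Participant 1", "Participant 2", "Participant 3",
                 "Participant 4", "Participant 5", "Participant 6",
                 "Participant 7")
origin <- c("India", "India", "Algeria", "Algeria", "India", "India",
            "Algeria")

# Integer codes follow the alphabetical level order:
# "Algeria" becomes 1 and "India" becomes 2.
origin_code <- as.integer(factor(origin))

# reorder() sorts the levels of 'participant' by the mean of 'origin_code'
# within each participant, so Algerian participants come before Indian ones.
participant_by_origin <- reorder(participant, origin_code)
levels(participant_by_origin)
```

Mapping x = participant_by_origin in aes() would then group the participants by origin on the axis, although the axis labels themselves would still show only the participant names.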

Upvotes: 2

Views: 1739

Answers (1)

Valentin_Ștefan

Reputation: 6436

One suggestion is to use an ordered factor. For the levels of the factor, concatenate Origin and Participant (in that order, so that sorting groups participants by origin). For the labels of the factor, concatenate Participant and Origin.

# The unique values of 'Origin_Participant' will act as the levels of the
# factor. Because 'Origin' comes first in each string, sorting on this column
# groups participants from the same country together.
Data$Origin_Participant <- paste(Data$Origin, Data$Participant, sep = "\n")
# The unique values of 'Participant_Origin' will be used as the factor's
# labels (the text that ends up on the plot).
Data$Participant_Origin <- paste(Data$Participant, Data$Origin, sep = "\n")
# Order the data frame by 'Origin_Participant'. This also matters because it
# makes the levels line up with the labels when the factor is created below.
Data <- Data[order(Data$Origin_Participant),]
# Or in decreasing order, if needed:
# Data <- Data[order(Data$Origin_Participant, decreasing = TRUE),]

# Finally, create the needed factor.
Data$Origin_Participant <- factor(x = Data$Origin_Participant,
                                  levels = unique(Data$Origin_Participant),
                                  labels = unique(Data$Participant_Origin),
                                  ordered = TRUE)

library(ggplot2)
# Reuse your code, but map the factor `Origin_Participant` to x. I don't think
# a separate grouping variable is needed. I also added vjust = 0.5 to center
# the labels vertically.
ggplot(Data, aes(y=Percentage, x = Origin_Participant))+
  geom_point(aes(color = Task))+ 
  geom_line(arrow = arrow(length=unit(0.30,"cm"), type = "closed"), size = .3)+
  facet_grid(~Treatment, scales = "free_x", space = "free_x")+ 
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))


If you do not mind Origin appearing first in the labels, the code is a few steps shorter:

Data$Origin_Participant <- factor(x = paste(Data$Origin, Data$Participant, sep = "\n"),
                                  ordered = TRUE)
ggplot(Data, aes(y=Percentage, x = Origin_Participant))+
  geom_point(aes(color = Task))+ 
  geom_line(arrow = arrow(length=unit(0.30,"cm"), type = "closed"), size = .3)+
  facet_grid(~Treatment, scales = "free_x", space = "free_x")+ 
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
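
One caveat with relying on the default level order: the pasted strings are sorted as text, so a label such as "Participant 10" would sort before "Participant 2". A minimal base R sketch of setting the level order explicitly is shown below. The vectors and the helper variable participant_id are hypothetical and only illustrate the idea; they assume the participant labels all follow the "Participant <number>" pattern.

```r
# Hypothetical labels, including a two-digit participant number.
participant <- c("Participant 2", "Participant 10", "Participant 1",
                 "Participant 3")
origin      <- c("India", "Algeria", "India", "Algeria")

# Extract the number from each label so that 10 sorts after 2.
participant_id <- as.integer(sub("^Participant ", "", participant))

# Build the axis strings, then order them by origin first and by the
# numeric participant id second.
axis_label <- paste(origin, participant, sep = "\n")
ord <- order(origin, participant_id)

# Use the explicitly ordered strings as the factor levels.
origin_participant <- factor(axis_label, levels = unique(axis_label[ord]))
levels(origin_participant)
```

The resulting factor can be mapped to x in the same way as Origin_Participant in the code above.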


Upvotes: 0
