Md. Ahsan

Reputation: 23

Sorting ggplot barplot x-axis in descending order (only one variable on the x-axis)

I have a big dataframe ("all_data") of words that were presented as Arabic audio to participants, who then had to choose from four options what they thought the word meant in English. Their choice is recorded in the Choice column, and the correct answer is in a different column:

Preview of my screen

Anyway, I wanted to add a column to this "all_data" df that shows the most frequent response for each word, even if it wasn't the target word, but I couldn't figure out how to do that. So, moving on, I at least wanted to visualize it, so I created bar plots to check, for each of the 100 words, what people thought the word sounded like. I filtered on the word in question (doing this for each of the 100 words is tedious, I know) and then put the "Choice" column on the x-axis, as you can see in the code below:

ggplot(filter(all_data, Correct == "Stormy"), aes(Choice)) + geom_bar()

This leads to the bar plot you see on the bottom right of the print screen I showed.

I tried several things to rearrange the x-axis in order of frequency, but nothing worked! I have searched the other threads similar to this question, but they all have a y variable to use in the reordering, which I do not have, so I always get errors.

I understand there is technically no defined y-axis, so R creates its own 'count' of the words on the x-axis, but I can't figure out what name to use to refer to this count, as you can see from my many attempts above.

Anyway, after all that, I've just been viewing my plots in the default order and making notes, but can anyone help with any of my problems?

Much appreciated!

Ahsan

Upvotes: 2

Views: 5924

Answers (2)

stefan

Reputation: 125268

The easiest way is to compute the frequencies manually, e.g. via count or group_by + summarise, and plot the aggregated df with geom_col instead of geom_bar. You can then reorder Choice by frequency, e.g. via forcats::fct_reorder or reorder.

Using some fake random data to mimic your dataset:

library(ggplot2)
library(dplyr, warn.conflicts = FALSE)
library(forcats)

set.seed(42)

all_data <- data.frame(
  Choice = sample(c("Painful", "Shook", "Humilation"), 30, replace = TRUE),
  Correct = sample(c("Stormy", "Truth"), 30, replace = TRUE)
)

# Aggregate your data outside of ggplot
all_data %>%
  filter(Correct == "Stormy") %>%
  count(Choice) %>%
  ggplot(aes(reorder(Choice, -n), n)) +
  geom_col()
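The forcats::fct_reorder variant mentioned above could look roughly like this. It is only a sketch on the same fake data; the column names Choice and Correct are assumptions about the real all_data, and stormy_counts is a name made up for illustration.

```r
library(ggplot2)
library(dplyr, warn.conflicts = FALSE)
library(forcats)

set.seed(42)

# Same fake data as above (column names are assumptions about the real data)
all_data <- data.frame(
  Choice = sample(c("Painful", "Shook", "Humilation"), 30, replace = TRUE),
  Correct = sample(c("Stormy", "Truth"), 30, replace = TRUE)
)

# Count the choices for one target word, then sort the levels of Choice by n.
# .desc = TRUE puts the most frequent choice first.
stormy_counts <- all_data %>%
  filter(Correct == "Stormy") %>%
  count(Choice) %>%
  mutate(Choice = fct_reorder(Choice, n, .desc = TRUE))

# The x-axis follows the factor level order, so the bars come out descending
p <- ggplot(stormy_counts, aes(Choice, n)) +
  geom_col() +
  labs(x = "Choice", y = "Count")
p
```

Reordering inside mutate keeps the ordering with the data, so any later plot of stormy_counts uses the same order.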

If you don't want to aggregate your data manually, a second option is to use forcats::fct_infreq inside geom_bar:

# Or using forcats::fct_infreq
all_data |>
  filter(Correct == "Stormy") |>
  ggplot() +
  geom_bar(aes(x = forcats::fct_infreq(Choice)))
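The question also asks about adding a column with the most frequent response for each word. One possible approach, sketched on the same fake data: count the choices per target word, keep the top one, and join it back. The names most_frequent, n_choice and MostFrequentChoice are made up for illustration, and the column names Choice and Correct are assumptions about the real all_data.

```r
library(dplyr, warn.conflicts = FALSE)

set.seed(42)

# Same fake data as above (column names are assumptions about the real data)
all_data <- data.frame(
  Choice = sample(c("Painful", "Shook", "Humilation"), 30, replace = TRUE),
  Correct = sample(c("Stormy", "Truth"), 30, replace = TRUE)
)

# For each target word, keep the single most frequent choice.
# with_ties = FALSE keeps only one row per word if several choices tie.
most_frequent <- all_data %>%
  count(Correct, Choice, name = "n_choice") %>%
  group_by(Correct) %>%
  slice_max(n_choice, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(Correct, MostFrequentChoice = Choice)

# Attach the result to every row of the original data
all_data <- all_data %>%
  left_join(most_frequent, by = "Correct")

head(all_data)
```

If ties matter for your analysis, with_ties = TRUE would keep all tied choices, but the join would then duplicate rows for those words.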

Upvotes: 1

Md. Ahsan

Reputation: 23

Thank you all.

The above answers would probably work really well, but I figured out the easiest way would probably be to:

install.packages("forcats")
library(forcats)
ggplot(filter(all_data, Correct == "Rage"), aes(fct_infreq(factor(Choice)))) + geom_bar()
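To avoid repeating that line by hand for each of the 100 words, the same call could be wrapped in a loop. This is only a sketch: the fake data stands in for the real all_data, the names words and plots are made up for illustration, and subset() is used in place of filter() so that only ggplot2 and forcats are needed.

```r
library(ggplot2)
library(forcats)

set.seed(42)

# Fake data standing in for the real all_data (column names are assumptions)
all_data <- data.frame(
  Choice = sample(c("Painful", "Shook", "Humilation"), 30, replace = TRUE),
  Correct = sample(c("Stormy", "Truth"), 30, replace = TRUE)
)

# Build one frequency-ordered bar plot per target word
words <- unique(all_data$Correct)
plots <- lapply(words, function(w) {
  ggplot(subset(all_data, Correct == w), aes(fct_infreq(factor(Choice)))) +
    geom_bar() +
    labs(title = w, x = "Choice", y = "Count")
})
names(plots) <- words

# Look at a single word by name
plots[["Stormy"]]
```

Each element of plots is an ordinary ggplot object, so it can be printed or passed to ggsave() one at a time.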

Thank you all so much for your help!

Upvotes: 0
