Bailey
Bailey

Reputation: 131

Grouped Barchart y-Axis -- ggplot

I am working on a grouped barchart that will hopefully show the range of final grades of students by their number of hours of studying over three different exams.

Here is what I've managed to build so far. Ignore the y axis and the actual bars as they are incorrect. What I want is for the y axis to be the number of students who have gotten a given grade with a given number of study hours.

enter image description here

And here is the code I used:

ggplot(P1, aes (x=Hours,y=Bins)) +
  geom_bar(aes(fill= Bins),stat="identity", position= "dodge")+
  facet_grid(Exam~.)+
  scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
  labs (x= "Hours Spent Studying", y="# of Students")

I'm pretty sure the problem is with my data input, which looks like this:

ID Exam     Hours    Bins
1 S001    1   0-1 hrs  61-70%
2 S002    1   4-5 hrs  51-60%
3 S003    1 12-13 hrs  51-60%
4 S004    1   6-7 hrs 91-100%
5 S005    1   6-7 hrs  81-90%
6 S006    1 12-13 hrs  61-70%

I think I need to add a "Count" column to act as the y axis. However, I'm confused as I didn't need to do this when making a normal non-grouped bar chart. Is this the solution? And if so, how do I add a count column?

I am a beginner at R.

Here is a dput of my dataframe:

> dput(P1)
structure(list(ID = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 
22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 
35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 
48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 
61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 
74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 
87L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 
42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 
55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 
68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 
81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 91L, 92L, 93L, 
94L, 95L, 96L, 97L, 98L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 
36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 
49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 
62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 
75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 
88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L), .Label = c("S001", 
"S002", "S003", "S004", "S005", "S006", "S007", "S008", "S009", 
"S010", "S011", "S012", "S013", "S014", "S015", "S016", "S017", 
"S018", "S019", "S020", "S021", "S022", "S023", "S024", "S025", 
"S026", "S027", "S028", "S029", "S030", "S031", "S032", "S033", 
"S034", "S035", "S036", "S037", "S038", "S039", "S040", "S041", 
"S042", "S043", "S044", "S045", "S046", "S047", "S048", "S049", 
"S050", "S051", "S052", "S053", "S054", "S055", "S056", "S057", 
"S058", "S059", "S060", "S061", "S062", "S063", "S064", "S065", 
"S066", "S067", "S068", "S069", "S070", "S071", "S072", "S073", 
"S074", "S075", "S076", "S077", "S078", "S079", "S080", "S081", 
"S082", "S083", "S084", "S085", "S086", "S087", "S088", "S089", 
"S090", "S091", "S092", "S093", "S094", "S095", "S096", "S097", 
"S098"), class = "factor"), Exam = c("1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3"), Hours = c("0-1 hrs", "4-5 hrs", "12-13 hrs", 
"6-7 hrs", "6-7 hrs", "12-13 hrs", "14+ hrs", "6-7 hrs", "2-3 hrs", 
"6-7 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", "12-13 hrs", "2-3 hrs", 
"12-13 hrs", "2-3 hrs", "4-5 hrs", "10-11 hrs", "0-1 hrs", "4-5 hrs", 
"10-11 hrs", "4-5 hrs", "8-9 hrs", "0-1 hrs", "12-13 hrs", "2-3 hrs", 
"6-7 hrs", "6-7 hrs", "10-11 hrs", "10-11 hrs", "6-7 hrs", "10-11 hrs", 
"12-13 hrs", "6-7 hrs", "6-7 hrs", "14+ hrs", "2-3 hrs", "4-5 hrs", 
"6-7 hrs", "4-5 hrs", "4-5 hrs", "8-9 hrs", "8-9 hrs", "2-3 hrs", 
"14+ hrs", "2-3 hrs", "2-3 hrs", "8-9 hrs", "8-9 hrs", "6-7 hrs", 
"14+ hrs", "8-9 hrs", "10-11 hrs", "10-11 hrs", "8-9 hrs", "2-3 hrs", 
"8-9 hrs", "8-9 hrs", "4-5 hrs", "2-3 hrs", "4-5 hrs", "2-3 hrs", 
"4-5 hrs", "4-5 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", 
"4-5 hrs", "8-9 hrs", "14+ hrs", "0-1 hrs", "4-5 hrs", "10-11 hrs", 
"4-5 hrs", "4-5 hrs", "8-9 hrs", "4-5 hrs", "12-13 hrs", "4-5 hrs", 
"6-7 hrs", "8-9 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", "6-7 hrs", 
"4-5 hrs", "10-11 hrs", "4-5 hrs", "4-5 hrs", "10-11 hrs", "12-13 hrs", 
"4-5 hrs", "6-7 hrs", "4-5 hrs", "4-5 hrs", "12-13 hrs", "2-3 hrs", 
"6-7 hrs", "10-11 hrs", "6-7 hrs", "8-9 hrs", "6-7 hrs", "8-9 hrs", 
"2-3 hrs", "4-5 hrs", "8-9 hrs", "4-5 hrs", "4-5 hrs", "6-7 hrs", 
"6-7 hrs", "6-7 hrs", "14+ hrs", "4-5 hrs", "10-11 hrs", "6-7 hrs", 
"0-1 hrs", "4-5 hrs", "12-13 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", 
"12-13 hrs", "2-3 hrs", "4-5 hrs", "10-11 hrs", "8-9 hrs", "4-5 hrs", 
"8-9 hrs", "8-9 hrs", "14+ hrs", "8-9 hrs", "8-9 hrs", "4-5 hrs", 
"0-1 hrs", "4-5 hrs", "6-7 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", 
"8-9 hrs", "2-3 hrs", "12-13 hrs", "4-5 hrs", "2-3 hrs", "6-7 hrs", 
"10-11 hrs", "6-7 hrs", "12-13 hrs", "8-9 hrs", "8-9 hrs", "10-11 hrs", 
"8-9 hrs", "2-3 hrs", "8-9 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", 
"4-5 hrs", "2-3 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", "4-5 hrs", 
"8-9 hrs", "8-9 hrs", "2-3 hrs", "10-11 hrs", "10-11 hrs", "0-1 hrs", 
"2-3 hrs", "6-7 hrs", "6-7 hrs", "4-5 hrs", "8-9 hrs", "2-3 hrs", 
"12-13 hrs", "0-1 hrs", "4-5 hrs", "6-7 hrs", "2-3 hrs", "2-3 hrs", 
"8-9 hrs", "6-7 hrs", "2-3 hrs", "2-3 hrs", "2-3 hrs", "4-5 hrs", 
"8-9 hrs", "6-7 hrs", "10-11 hrs", "8-9 hrs", "4-5 hrs", "6-7 hrs", 
"12-13 hrs", "2-3 hrs", "4-5 hrs", "14+ hrs", "6-7 hrs", "8-9 hrs", 
"6-7 hrs", "14+ hrs", "10-11 hrs", "4-5 hrs", "4-5 hrs", "6-7 hrs", 
"6-7 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "14+ hrs", "6-7 hrs", 
"6-7 hrs", "2-3 hrs", "2-3 hrs", "2-3 hrs", "14+ hrs", "4-5 hrs", 
"6-7 hrs", "0-1 hrs", "12-13 hrs", "2-3 hrs", "10-11 hrs", "10-11 hrs", 
"4-5 hrs", "6-7 hrs", "6-7 hrs", "10-11 hrs", "14+ hrs", "10-11 hrs", 
"6-7 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", "14+ hrs", "4-5 hrs", 
"14+ hrs", "4-5 hrs", "10-11 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", 
"4-5 hrs", "10-11 hrs", "6-7 hrs", "12-13 hrs", "14+ hrs", "6-7 hrs", 
"6-7 hrs", "10-11 hrs", "10-11 hrs", "4-5 hrs", "6-7 hrs", "10-11 hrs", 
"4-5 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", 
"4-5 hrs", "6-7 hrs", "4-5 hrs", "6-7 hrs", "14+ hrs", "6-7 hrs", 
"12-13 hrs", "0-1 hrs", "10-11 hrs", "14+ hrs", "8-9 hrs", "6-7 hrs", 
"6-7 hrs", "6-7 hrs", "10-11 hrs", "12-13 hrs", "10-11 hrs", 
"10-11 hrs", "2-3 hrs", "2-3 hrs", "10-11 hrs", "6-7 hrs", "6-7 hrs", 
"6-7 hrs", "4-5 hrs", "6-7 hrs", "10-11 hrs", "6-7 hrs", "10-11 hrs", 
"4-5 hrs", "6-7 hrs", "10-11 hrs", "14+ hrs"), Bins = structure(c(3L, 
2L, 2L, 6L, 5L, 3L, 2L, 6L, 1L, 2L, 6L, 1L, 1L, 1L, 3L, 2L, 2L, 
3L, 2L, 5L, 2L, 1L, 6L, 3L, 4L, 6L, 6L, 4L, 4L, 4L, 5L, 3L, 5L, 
3L, 1L, 1L, 4L, 4L, 3L, 1L, 3L, 2L, 2L, 5L, 1L, 3L, 6L, 4L, 4L, 
5L, 1L, 4L, 2L, 3L, 3L, 2L, 4L, 3L, 3L, 4L, 6L, 1L, 6L, 3L, 4L, 
2L, 4L, 1L, 2L, 3L, 6L, 4L, 5L, 4L, 4L, 6L, 2L, 2L, 4L, 2L, 1L, 
5L, 5L, 3L, 2L, 4L, 4L, 4L, 1L, 5L, 4L, 4L, 2L, 1L, 2L, 2L, 1L, 
1L, 4L, 3L, 1L, 6L, 4L, 3L, 3L, 5L, 1L, 2L, 4L, 2L, 1L, 1L, 4L, 
1L, 1L, 3L, 3L, 6L, 1L, 1L, 6L, 2L, 6L, 6L, 4L, 1L, 4L, 4L, 5L, 
2L, 3L, 3L, 4L, 1L, 1L, 2L, 1L, 1L, 5L, 1L, 3L, 4L, 1L, 3L, 5L, 
3L, 4L, 5L, 1L, 4L, 3L, 6L, 1L, 6L, 5L, 1L, 2L, 1L, 3L, 1L, 6L, 
3L, 3L, 1L, 2L, 1L, 1L, 3L, 3L, 1L, 5L, 6L, 3L, 2L, 1L, 5L, 1L, 
1L, 1L, 2L, 3L, 5L, 6L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 5L, 2L, 4L, 
1L, 1L, 1L, 5L, 3L, 1L, 5L, 5L, 1L, 2L, 5L, 1L, 1L, 5L, 2L, 1L, 
2L, 5L, 3L, 1L, 4L, 3L, 5L, 1L, 3L, 5L, 2L, 4L, 5L, 4L, 2L, 3L, 
5L, 3L, 1L, 5L, 2L, 4L, 2L, 4L, 3L, 1L, 1L, 1L, 2L, 3L, 4L, 1L, 
3L, 5L, 4L, 4L, 4L, 2L, 4L, 3L, 3L, 3L, 2L, 4L, 1L, 6L, 1L, 5L, 
3L, 5L, 5L, 4L, 4L, 5L, 2L, 2L, 4L, 4L, 4L, 2L, 6L, 4L, 3L, 1L, 
5L, 1L, 1L, 3L, 1L, 4L, 5L, 5L, 5L, 4L, 2L, 4L, 3L, 1L, 2L, 4L, 
1L, 3L, 1L, 1L, 2L), .Label = c("50% or less", "51-60%", "61-70%", 
"71-80%", "81-90%", "91-100%"), class = "factor")), class = "data.frame", row.names = c(NA, 
-294L), .Names = c("ID", "Exam", "Hours", "Bins"))

Upvotes: 0

Views: 222

Answers (2)

jay.sf
jay.sf

Reputation: 73802

You can group the data frame and count the students.

P2 <- with(P1, setNames(aggregate(ID, list(Hours, Bins, Exam), length),
                        c("Hours", "Bins", "Exam", "Count")))

Or alternatively using dplyr:

library(dplyr)
P2 <- P1 %>%
  group_by(Hours, Bins, Exam) %>%
  summarise(Count=n())

In the plot then put 'Count' onto the y axis and fill with 'Bins'.

library(ggplot2)
ggplot(P2, aes (x=Hours,y=Count)) +
  geom_bar(aes(fill= Bins), stat="identity", position= "dodge") +
  facet_grid(Exam~.) +
  scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                            "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
  scale_y_discrete(limits = seq(2, 10, 2)) +
  labs (x= "Hours Spent Studying", y="# of Students")

This would yield this:

enter image description here

Eventually this is what you want.

You could also consider a stacked bar plot, which could be slightly more synoptic.

blevels <- levels(P2$Bins)  # save levels for labeling below 

library(ggplot2)
ggplot(P2, aes (x=Hours,y=Count)) +
  geom_bar(aes(fill= as.numeric(Bins)), stat="identity", position= "stack") +
  facet_grid(Exam~.) +
  scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                            "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
  labs(x= "Hours Spent Studying", y="# of Students", fill='Bins') +
  scale_fill_continuous(labels=blevels)

enter image description here

Upvotes: 1

Nick Holt
Nick Holt

Reputation: 116

This should do the trick. Notice that counts are calculated by using group_by() %>% summarize() before being piped into ggplot().

P1 %>% group_by(Exam, Hours, Bins) %>%
    summarize(Count = n()) %>%
    ggplot(aes(x=Hours,y=Count)) +
            geom_bar(aes(fill = Bins),stat="identity", position= "dodge")+
            facet_grid(Exam ~ .) +
            scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
            labs (x= "Hours Spent Studying", y="# of Students")

Sorry I'm not cool enough to post pictures on SO :)

Upvotes: 1

Related Questions