Reputation: 131
I am working on a grouped bar chart that should show the spread of students' final grades by the number of hours they spent studying, across three different exams.
Here is what I've managed to build so far. Ignore the y axis and the actual bars as they are incorrect. What I want is for the y axis to be the number of students who have gotten a given grade with a given number of study hours.
And here is the code I used:
ggplot(P1, aes (x=Hours,y=Bins)) +
geom_bar(aes(fill= Bins),stat="identity", position= "dodge")+
facet_grid(Exam~.)+
scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) +
labs (x= "Hours Spent Studying", y="# of Students")
I'm pretty sure the problem is with my data input, which looks like this:
    ID Exam     Hours    Bins
1 S001    1   0-1 hrs  61-70%
2 S002    1   4-5 hrs  51-60%
3 S003    1 12-13 hrs  51-60%
4 S004    1   6-7 hrs 91-100%
5 S005    1   6-7 hrs  81-90%
6 S006    1 12-13 hrs  61-70%
I think I need to add a "Count" column to act as the y axis. However, I'm confused as I didn't need to do this when making a normal non-grouped bar chart. Is this the solution? And if so, how do I add a count column?
I am a beginner at R.
Here is a dput of my dataframe:
> dput(P1)
structure(list(ID = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L,
22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L,
35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L,
48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L,
61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L,
74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L,
87L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L,
42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L,
55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L,
68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L,
81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 91L, 92L, 93L,
94L, 95L, 96L, 97L, 98L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L,
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L,
36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L,
49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L,
62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L,
75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L,
88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L), .Label = c("S001",
"S002", "S003", "S004", "S005", "S006", "S007", "S008", "S009",
"S010", "S011", "S012", "S013", "S014", "S015", "S016", "S017",
"S018", "S019", "S020", "S021", "S022", "S023", "S024", "S025",
"S026", "S027", "S028", "S029", "S030", "S031", "S032", "S033",
"S034", "S035", "S036", "S037", "S038", "S039", "S040", "S041",
"S042", "S043", "S044", "S045", "S046", "S047", "S048", "S049",
"S050", "S051", "S052", "S053", "S054", "S055", "S056", "S057",
"S058", "S059", "S060", "S061", "S062", "S063", "S064", "S065",
"S066", "S067", "S068", "S069", "S070", "S071", "S072", "S073",
"S074", "S075", "S076", "S077", "S078", "S079", "S080", "S081",
"S082", "S083", "S084", "S085", "S086", "S087", "S088", "S089",
"S090", "S091", "S092", "S093", "S094", "S095", "S096", "S097",
"S098"), class = "factor"), Exam = c("1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "3", "3"), Hours = c("0-1 hrs", "4-5 hrs", "12-13 hrs",
"6-7 hrs", "6-7 hrs", "12-13 hrs", "14+ hrs", "6-7 hrs", "2-3 hrs",
"6-7 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", "12-13 hrs", "2-3 hrs",
"12-13 hrs", "2-3 hrs", "4-5 hrs", "10-11 hrs", "0-1 hrs", "4-5 hrs",
"10-11 hrs", "4-5 hrs", "8-9 hrs", "0-1 hrs", "12-13 hrs", "2-3 hrs",
"6-7 hrs", "6-7 hrs", "10-11 hrs", "10-11 hrs", "6-7 hrs", "10-11 hrs",
"12-13 hrs", "6-7 hrs", "6-7 hrs", "14+ hrs", "2-3 hrs", "4-5 hrs",
"6-7 hrs", "4-5 hrs", "4-5 hrs", "8-9 hrs", "8-9 hrs", "2-3 hrs",
"14+ hrs", "2-3 hrs", "2-3 hrs", "8-9 hrs", "8-9 hrs", "6-7 hrs",
"14+ hrs", "8-9 hrs", "10-11 hrs", "10-11 hrs", "8-9 hrs", "2-3 hrs",
"8-9 hrs", "8-9 hrs", "4-5 hrs", "2-3 hrs", "4-5 hrs", "2-3 hrs",
"4-5 hrs", "4-5 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs",
"4-5 hrs", "8-9 hrs", "14+ hrs", "0-1 hrs", "4-5 hrs", "10-11 hrs",
"4-5 hrs", "4-5 hrs", "8-9 hrs", "4-5 hrs", "12-13 hrs", "4-5 hrs",
"6-7 hrs", "8-9 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", "6-7 hrs",
"4-5 hrs", "10-11 hrs", "4-5 hrs", "4-5 hrs", "10-11 hrs", "12-13 hrs",
"4-5 hrs", "6-7 hrs", "4-5 hrs", "4-5 hrs", "12-13 hrs", "2-3 hrs",
"6-7 hrs", "10-11 hrs", "6-7 hrs", "8-9 hrs", "6-7 hrs", "8-9 hrs",
"2-3 hrs", "4-5 hrs", "8-9 hrs", "4-5 hrs", "4-5 hrs", "6-7 hrs",
"6-7 hrs", "6-7 hrs", "14+ hrs", "4-5 hrs", "10-11 hrs", "6-7 hrs",
"0-1 hrs", "4-5 hrs", "12-13 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs",
"12-13 hrs", "2-3 hrs", "4-5 hrs", "10-11 hrs", "8-9 hrs", "4-5 hrs",
"8-9 hrs", "8-9 hrs", "14+ hrs", "8-9 hrs", "8-9 hrs", "4-5 hrs",
"0-1 hrs", "4-5 hrs", "6-7 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs",
"8-9 hrs", "2-3 hrs", "12-13 hrs", "4-5 hrs", "2-3 hrs", "6-7 hrs",
"10-11 hrs", "6-7 hrs", "12-13 hrs", "8-9 hrs", "8-9 hrs", "10-11 hrs",
"8-9 hrs", "2-3 hrs", "8-9 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs",
"4-5 hrs", "2-3 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", "4-5 hrs",
"8-9 hrs", "8-9 hrs", "2-3 hrs", "10-11 hrs", "10-11 hrs", "0-1 hrs",
"2-3 hrs", "6-7 hrs", "6-7 hrs", "4-5 hrs", "8-9 hrs", "2-3 hrs",
"12-13 hrs", "0-1 hrs", "4-5 hrs", "6-7 hrs", "2-3 hrs", "2-3 hrs",
"8-9 hrs", "6-7 hrs", "2-3 hrs", "2-3 hrs", "2-3 hrs", "4-5 hrs",
"8-9 hrs", "6-7 hrs", "10-11 hrs", "8-9 hrs", "4-5 hrs", "6-7 hrs",
"12-13 hrs", "2-3 hrs", "4-5 hrs", "14+ hrs", "6-7 hrs", "8-9 hrs",
"6-7 hrs", "14+ hrs", "10-11 hrs", "4-5 hrs", "4-5 hrs", "6-7 hrs",
"6-7 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "14+ hrs", "6-7 hrs",
"6-7 hrs", "2-3 hrs", "2-3 hrs", "2-3 hrs", "14+ hrs", "4-5 hrs",
"6-7 hrs", "0-1 hrs", "12-13 hrs", "2-3 hrs", "10-11 hrs", "10-11 hrs",
"4-5 hrs", "6-7 hrs", "6-7 hrs", "10-11 hrs", "14+ hrs", "10-11 hrs",
"6-7 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", "14+ hrs", "4-5 hrs",
"14+ hrs", "4-5 hrs", "10-11 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs",
"4-5 hrs", "10-11 hrs", "6-7 hrs", "12-13 hrs", "14+ hrs", "6-7 hrs",
"6-7 hrs", "10-11 hrs", "10-11 hrs", "4-5 hrs", "6-7 hrs", "10-11 hrs",
"4-5 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
"4-5 hrs", "6-7 hrs", "4-5 hrs", "6-7 hrs", "14+ hrs", "6-7 hrs",
"12-13 hrs", "0-1 hrs", "10-11 hrs", "14+ hrs", "8-9 hrs", "6-7 hrs",
"6-7 hrs", "6-7 hrs", "10-11 hrs", "12-13 hrs", "10-11 hrs",
"10-11 hrs", "2-3 hrs", "2-3 hrs", "10-11 hrs", "6-7 hrs", "6-7 hrs",
"6-7 hrs", "4-5 hrs", "6-7 hrs", "10-11 hrs", "6-7 hrs", "10-11 hrs",
"4-5 hrs", "6-7 hrs", "10-11 hrs", "14+ hrs"), Bins = structure(c(3L,
2L, 2L, 6L, 5L, 3L, 2L, 6L, 1L, 2L, 6L, 1L, 1L, 1L, 3L, 2L, 2L,
3L, 2L, 5L, 2L, 1L, 6L, 3L, 4L, 6L, 6L, 4L, 4L, 4L, 5L, 3L, 5L,
3L, 1L, 1L, 4L, 4L, 3L, 1L, 3L, 2L, 2L, 5L, 1L, 3L, 6L, 4L, 4L,
5L, 1L, 4L, 2L, 3L, 3L, 2L, 4L, 3L, 3L, 4L, 6L, 1L, 6L, 3L, 4L,
2L, 4L, 1L, 2L, 3L, 6L, 4L, 5L, 4L, 4L, 6L, 2L, 2L, 4L, 2L, 1L,
5L, 5L, 3L, 2L, 4L, 4L, 4L, 1L, 5L, 4L, 4L, 2L, 1L, 2L, 2L, 1L,
1L, 4L, 3L, 1L, 6L, 4L, 3L, 3L, 5L, 1L, 2L, 4L, 2L, 1L, 1L, 4L,
1L, 1L, 3L, 3L, 6L, 1L, 1L, 6L, 2L, 6L, 6L, 4L, 1L, 4L, 4L, 5L,
2L, 3L, 3L, 4L, 1L, 1L, 2L, 1L, 1L, 5L, 1L, 3L, 4L, 1L, 3L, 5L,
3L, 4L, 5L, 1L, 4L, 3L, 6L, 1L, 6L, 5L, 1L, 2L, 1L, 3L, 1L, 6L,
3L, 3L, 1L, 2L, 1L, 1L, 3L, 3L, 1L, 5L, 6L, 3L, 2L, 1L, 5L, 1L,
1L, 1L, 2L, 3L, 5L, 6L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 5L, 2L, 4L,
1L, 1L, 1L, 5L, 3L, 1L, 5L, 5L, 1L, 2L, 5L, 1L, 1L, 5L, 2L, 1L,
2L, 5L, 3L, 1L, 4L, 3L, 5L, 1L, 3L, 5L, 2L, 4L, 5L, 4L, 2L, 3L,
5L, 3L, 1L, 5L, 2L, 4L, 2L, 4L, 3L, 1L, 1L, 1L, 2L, 3L, 4L, 1L,
3L, 5L, 4L, 4L, 4L, 2L, 4L, 3L, 3L, 3L, 2L, 4L, 1L, 6L, 1L, 5L,
3L, 5L, 5L, 4L, 4L, 5L, 2L, 2L, 4L, 4L, 4L, 2L, 6L, 4L, 3L, 1L,
5L, 1L, 1L, 3L, 1L, 4L, 5L, 5L, 5L, 4L, 2L, 4L, 3L, 1L, 2L, 4L,
1L, 3L, 1L, 1L, 2L), .Label = c("50% or less", "51-60%", "61-70%",
"71-80%", "81-90%", "91-100%"), class = "factor")), class = "data.frame", row.names = c(NA,
-294L), .Names = c("ID", "Exam", "Hours", "Bins"))
Upvotes: 0
Views: 222
Reputation: 73802
You can group the data frame and count the students.
P2 <- with(P1, setNames(aggregate(ID, list(Hours, Bins, Exam), length),
c("Hours", "Bins", "Exam", "Count")))
Or alternatively, using dplyr:
library(dplyr)
P2 <- P1 %>%
group_by(Hours, Bins, Exam) %>%
summarise(Count=n())
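To see what this counting step produces, here is a minimal sketch on a small made-up data frame. The name `toy` and its values are assumptions for illustration only and are not taken from `P1`. It reuses the `aggregate()` call from above unchanged.

```r
# Toy data (hypothetical values, only for illustration)
toy <- data.frame(
  ID    = c("S001", "S002", "S003", "S004", "S005"),
  Exam  = c("1", "1", "1", "2", "2"),
  Hours = c("0-1 hrs", "0-1 hrs", "2-3 hrs", "0-1 hrs", "0-1 hrs"),
  Bins  = c("51-60%", "51-60%", "61-70%", "51-60%", "61-70%"),
  stringsAsFactors = TRUE
)

# Same aggregate() call as above: one row per Hours/Bins/Exam combination
# that occurs in the data, with the number of students in 'Count'
toy_counts <- with(toy, setNames(aggregate(ID, list(Hours, Bins, Exam), length),
                                 c("Hours", "Bins", "Exam", "Count")))
print(toy_counts)
```

Note that only combinations that actually occur get a row, so the counts always add up to the number of rows in the input.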
In the plot then put 'Count' onto the y axis and fill with 'Bins'.
library(ggplot2)
ggplot(P2, aes(x = Hours, y = Count)) +
  geom_bar(aes(fill = Bins), stat = "identity", position = "dodge") +
  facet_grid(Exam ~ .) +
  scale_x_discrete(limits = c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                              "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) +
  # Count is numeric, so the tick marks are set on a continuous scale
  scale_y_continuous(breaks = seq(2, 10, 2)) +
  labs(x = "Hours Spent Studying", y = "# of Students")
This yields a grouped bar chart with the number of students on the y axis, which is probably what you want.
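As for why a plain bar chart needed no count column: `geom_bar()` uses `stat = "count"` by default and counts the rows itself, so a separate count column is only required once `stat = "identity"` is set. A minimal sketch of that alternative follows, using a made-up data frame `toy` (an assumption for illustration, not your `P1`); the same call should work on the raw `P1` as well.

```r
library(ggplot2)

# Toy data (hypothetical values, only for illustration)
toy <- data.frame(
  Exam  = c("1", "1", "1", "2", "2"),
  Hours = c("0-1 hrs", "0-1 hrs", "2-3 hrs", "0-1 hrs", "0-1 hrs"),
  Bins  = c("51-60%", "51-60%", "61-70%", "51-60%", "61-70%")
)

# No y aesthetic and no stat = "identity": geom_bar() counts rows per group
p <- ggplot(toy, aes(x = Hours, fill = Bins)) +
  geom_bar(position = "dodge") +
  facet_grid(Exam ~ .) +
  labs(x = "Hours Spent Studying", y = "# of Students")

# Inspect the counts ggplot2 computed for the bars
bar_data <- layer_data(p, 1)
print(bar_data$count)
```

Pre-computing the counts, as done above, is still useful when you want to inspect or reuse the numbers outside the plot.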
You could also consider a stacked bar plot, which may give a slightly clearer overview.
blevels <- levels(P2$Bins)  # save levels for labeling below
library(ggplot2)
ggplot(P2, aes(x = Hours, y = Count)) +
  geom_bar(aes(fill = as.numeric(Bins)), stat = "identity", position = "stack") +
  facet_grid(Exam ~ .) +
  scale_x_discrete(limits = c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                              "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) +
  labs(x = "Hours Spent Studying", y = "# of Students", fill = "Bins") +
  # one break per grade bin, so that breaks and labels have the same length
  scale_fill_continuous(breaks = seq_along(blevels), labels = blevels)
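Both plots repeat the same `limits` vector to force the order of the x axis. One possible alternative is to store `Hours` as a factor with explicit levels, since ggplot2 orders a discrete axis by factor levels. The sketch below uses a short made-up vector `hours` (an assumption standing in for `P1$Hours`) to show the effect in base R.

```r
# Intended order of the study-hour categories
hour_levels <- c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                 "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")

# Toy character vector standing in for P1$Hours (hypothetical values)
hours <- c("10-11 hrs", "0-1 hrs", "14+ hrs", "2-3 hrs")

# Without explicit levels the categories would be ordered alphabetically,
# which puts "10-11 hrs" before "2-3 hrs"; explicit levels keep the real order
hours_f <- factor(hours, levels = hour_levels)
print(levels(hours_f))
print(sort(hours_f))
```

On the real data this would be something like `P1$Hours <- factor(P1$Hours, levels = hour_levels)` before counting. One difference from `limits`: by default, a discrete scale drops factor levels that do not occur in the data.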
Upvotes: 1
Reputation: 116
This should do the trick. Notice that counts are calculated by using group_by() %>% summarize() before being piped into ggplot().
library(dplyr)
library(ggplot2)
P1 %>%
  group_by(Exam, Hours, Bins) %>%
  summarize(Count = n()) %>%   # number of students per Exam/Hours/Bins combination
  ggplot(aes(x = Hours, y = Count)) +
  geom_bar(aes(fill = Bins), stat = "identity", position = "dodge") +
  facet_grid(Exam ~ .) +
  scale_x_discrete(limits = c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                              "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) +
  labs(x = "Hours Spent Studying", y = "# of Students")
Sorry I'm not cool enough to post pictures on SO :)
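As a side note, dplyr's `count()` should give the same counts as `group_by() %>% summarize(Count = n())` in a single step. Here is a minimal sketch on a made-up data frame `toy` (an assumption for illustration, not `P1`), assuming a dplyr version that supports the `name` argument of `count()`.

```r
library(dplyr)

# Toy data (hypothetical values, only for illustration)
toy <- data.frame(
  ID    = c("S001", "S002", "S003", "S004", "S005"),
  Exam  = c("1", "1", "1", "2", "2"),
  Hours = c("0-1 hrs", "0-1 hrs", "2-3 hrs", "0-1 hrs", "0-1 hrs"),
  Bins  = c("51-60%", "51-60%", "61-70%", "51-60%", "61-70%")
)

# count() groups by the given columns and tallies the rows;
# 'name' sets the name of the resulting count column
toy_counts <- toy %>% count(Exam, Hours, Bins, name = "Count")
print(toy_counts)
```

The result can be piped into `ggplot()` in the same way as the summarized data above.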
Upvotes: 1