Reputation: 23
I have a small DataFrame with the columns student_id, exam_1, exam_2, exam_3, exam_4, and exam_5, and one row for each of 5 students. I'd like to plot a bar graph showing the exam grades of one student (one specific row), and ultimately do this for each student or for a specific student chosen by user input.
For now, though, I'm stuck on how to plot a bar graph for just one specific student.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'student_id': [83838, 16373, 93538, 29383, 58585],
'exam_1': [80, 95, 90, 75, 50],
'exam_2': [60, 92, 88, 85, 40],
'exam_3': [70, 55, 75, 45, 60],
'exam_4': [55, 95, 45, 80, 55],
'exam_5': [91, 35, 92, 90, 75]})
print(df)
Which produces this as output:
student_id exam_1 exam_2 exam_3 exam_4 exam_5
0 83838 80 60 70 55 91
1 16373 95 92 55 95 35
2 93538 90 88 75 45 92
3 29383 75 85 45 80 90
4 58585 50 40 60 55 75
Adding the code below lets me select just one specific student ID (one row):
df = df.loc[df['student_id'] == 29383]
print(df)
student_id exam_1 exam_2 exam_3 exam_4 exam_5
3 29383 75 85 45 80 90
From here, I'd like to plot this particular student's exams in a bar plot.
I tried the code below, but it doesn't display the way I'd like. It seems that the index of this particular student is being used for the tick on the x-axis: as the image shows, there is a single tick labeled '3' with some bars around it.
exam_plots_for_29383 = df.plot.bar()
plt.show()
Which outputs this bar plot: [image: Dataframe with bar plot. Looks weird.]
I tried to transpose the dataframe, which kind of gets me to what I want. I used this code below:
df = df.T
exam_plots_for_29383_T = df.plot.bar()
plt.show()
But I end up with this graph: [image: Transpose of dataframe with bar plot. Looks weird still.]
I'm a bit stuck. I know there's a logical way to plot a bar plot from the DataFrame properly; I just can't for the life of me figure it out.
I'd like the bar plot to have:
I think the last two options are done automatically. It's just the first two that are breaking my brain. I appreciate any help or tips.
Here's the code in full, in case anyone would like to see it without it being split up as above.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'student_id': [83838, 16373, 93538, 29383, 58585],
'exam_1': [80, 95, 90, 75, 50],
'exam_2': [60, 92, 88, 85, 40],
'exam_3': [70, 55, 75, 45, 60],
'exam_4': [55, 95, 45, 80, 55],
'exam_5': [91, 35, 92, 90, 75]})
print(df)
df = df.loc[df['student_id'] == 29383]
print(df)
exam_plots_for_29383 = df.plot.bar()
plt.show()
df = df.T
exam_plots_for_29383_T = df.plot.bar()
plt.show()
Upvotes: 2
Views: 4407
Reputation: 59579
You are very close. The issue is that your numeric student ID is being plotted as if it were a grade, which distorts every plot (this is why ID 29383 gives you a bar close to 30,000 in all of your graphs).
Set 'student_id' as the index so that it doesn't get plotted. You can then plot each student separately by slicing the index with .loc[student_id], which returns that student's exams as a Series. If you plot the entire DataFrame instead, you get one group of bars per student, with one color per exam column.
df = df.set_index('student_id')
df.loc[29383].plot(kind='bar', figsize=(4,3), rot=30)
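To make it clearer why this works, here is a minimal self-contained sketch of the same idea, built on the question's data. Once 'student_id' is the index, df.loc[29383] is a Series whose index holds the exam names, so each exam becomes one bar. The axis labels, the title, and the Agg backend line are illustrative additions, not part of the original snippet; drop the backend line and call plt.show() when working interactively.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'student_id': [83838, 16373, 93538, 29383, 58585],
                   'exam_1': [80, 95, 90, 75, 50],
                   'exam_2': [60, 92, 88, 85, 40],
                   'exam_3': [70, 55, 75, 45, 60],
                   'exam_4': [55, 95, 45, 80, 55],
                   'exam_5': [91, 35, 92, 90, 75]}).set_index('student_id')

# Selecting a single index label returns a Series:
# index = exam names, values = that student's grades.
scores = df.loc[29383]

fig, ax = plt.subplots(figsize=(4, 3))
scores.plot(kind='bar', rot=30, ax=ax)

# Illustrative labels; scores.name holds the selected student ID.
ax.set_xlabel('exam')
ax.set_ylabel('grade')
ax.set_title(f'Student {scores.name}')
fig.tight_layout()
```

Because the ID now lives in the index rather than in a column, it no longer shows up as a bar of its own.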
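Since you know there are 5 exams, you can give each one its own color if you really want, using a categorical color palette such as tab10. (Passing one color per bar like this only works with Series.plot; with DataFrame.plot, a list of colors is applied per column.)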
from matplotlib import cm
df.loc[29383].plot(kind='bar', figsize=(4,3), rot=30, color=cm.tab10.colors[0:5])
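The question also mentions eventually picking the student from user input. One way this might be wrapped up is sketched below. The function name plot_student and its error handling are hypothetical and not part of the original answer; it assumes the DataFrame is already indexed by 'student_id', and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit when working interactively
import matplotlib.pyplot as plt
import pandas as pd


def plot_student(df, student_id):
    """Hypothetical helper: bar-plot one student's exams.

    Expects `df` to be indexed by student ID, with one column per exam.
    """
    if student_id not in df.index:
        raise KeyError(f"unknown student_id: {student_id}")
    # One tab10 color per exam column (tab10 has 10 colors, so up to 10 exams).
    colors = list(plt.get_cmap('tab10').colors[:len(df.columns)])
    fig, ax = plt.subplots(figsize=(4, 3))
    df.loc[student_id].plot(kind='bar', rot=30, color=colors, ax=ax)
    ax.set_title(f'Student {student_id}')
    fig.tight_layout()
    return ax


df = pd.DataFrame({'student_id': [83838, 16373, 93538, 29383, 58585],
                   'exam_1': [80, 95, 90, 75, 50],
                   'exam_2': [60, 92, 88, 85, 40],
                   'exam_3': [70, 55, 75, 45, 60],
                   'exam_4': [55, 95, 45, 80, 55],
                   'exam_5': [91, 35, 92, 90, 75]}).set_index('student_id')

ax = plot_student(df, 29383)

# With real user input, the ID could be read as, for example:
#     plot_student(df, int(input('Student ID: ')))
# and every student could be plotted with:
#     for sid in df.index: plot_student(df, sid)
```

Creating a fresh figure inside the helper keeps successive calls from drawing on top of each other, which matters when looping over all students.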
Upvotes: 2