Reputation: 133
I have this DataFrame
  StudentID   Name  Assignment1  Assignment2  Assignment3 ..  Assignment'n'
0        s1  user1            7            7           -3
1        s2  user2            2           10           10
2        s3  user3           12           10           10
3        s4  user4            4            2           10
4        s5  user5           -3            7            2
And I need to scatter plot the Assignment1, ..., Assignment'n' values.
On the plot, the x axis = Assignment1, ..., Assignment'n' and the y axis = [-3, 0, 2, 4, 7, 10, 12], which are the values in the Assignment columns.
I'm pretty lost, so I would like to know if anyone has a hint on how to solve this?
Upvotes: 2
Views: 4834
Reputation: 8703
First, you are looking for a line chart, not a scatter plot. A scatter plot is used when you plot two similar variables against each other, for example Assignment1 against Assignment2. That gives you an idea of how students performed across both assignments, which is useful if you want to do things like graphical regression.
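To make the distinction concrete, here is a minimal sketch of what a scatter plot of Assignment1 against Assignment2 could look like. The DataFrame is rebuilt by hand from the rows shown in the question (only the three assignment columns visible there), and the title text is just an illustrative choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question; only the columns shown there are included.
df = pd.DataFrame({
    "StudentID": ["s1", "s2", "s3", "s4", "s5"],
    "Name": ["user1", "user2", "user3", "user4", "user5"],
    "Assignment1": [7, 2, 12, 4, -3],
    "Assignment2": [7, 10, 10, 2, 7],
    "Assignment3": [-3, 10, 10, 10, 2],
})

# One point per student: x is the Assignment1 score, y is the Assignment2 score.
ax = df.plot.scatter(x="Assignment1", y="Assignment2")
ax.set_title("Assignment1 vs Assignment2")  # illustrative title
```

Each dot is one student, so this view compares two assignments rather than showing all of them along the x axis.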
Second, Pandas is overkill for a table the size of a class. In fact, I would not have used Python for this at all. R would be a better option, because you can use simpler types (like arrays) and give names to rows and columns. Also, plotting functions are directly accessible. But, since you have already started working with Pandas...
So, you need to import a few things:
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use('ggplot') # much better plot styles.
# There are others available, look them up if you want.
Now, you'll create a figure and an axes to draw on:
fig, ax = plt.subplots()
Since you want to plot the data from every column from the third one onwards, select them with .iloc and pass the axes to .plot(), which otherwise opens a new figure of its own:
df.iloc[:, 2:].plot(ax=ax)
You can now set axis limits, axis labels, modify tick marks, etc. I'll let you figure all of that out yourself.
You will finally need to actually draw your plot:
plt.show()
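Putting the steps together, a self-contained sketch might look like the following. It rests on a few assumptions that are not in the answer above: the DataFrame is rebuilt from the rows in the question, the assignment columns are transposed so that the x axis lists Assignment1, Assignment2, ... as the question asks (one line per student), and the y ticks are set to the values the question lists. The variable name `by_assignment` is made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

plt.style.use("ggplot")

# Sample data copied from the question; only the columns shown there are included.
df = pd.DataFrame({
    "StudentID": ["s1", "s2", "s3", "s4", "s5"],
    "Name": ["user1", "user2", "user3", "user4", "user5"],
    "Assignment1": [7, 2, 12, 4, -3],
    "Assignment2": [7, 10, 10, 2, 7],
    "Assignment3": [-3, 10, 10, 10, 2],
})

# Keep only the assignment columns, label rows by student name, then transpose
# so that each row is an assignment and each column is a student.
by_assignment = df.set_index("Name").iloc[:, 1:].T

# Draw on an explicit axes so that everything ends up in a single figure.
fig, ax = plt.subplots()
by_assignment.plot(ax=ax, marker="o")

# Assumption: the asker wants one x tick per assignment and the listed y values.
ax.set_xticks(range(len(by_assignment.index)))
ax.set_xticklabels(by_assignment.index)
ax.set_yticks([-3, 0, 2, 4, 7, 10, 12])
ax.set_ylabel("Score")
```

Calling plt.show() afterwards displays the figure. Dropping the .T gives the layout from the answer instead, with students along the x axis and one line per assignment.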
Upvotes: 2