user

Reputation: 79

Python error: generating a scatter plot using matplotlib

I am a Python newbie struggling to plot data from a CSV file with matplotlib.pyplot. I would like to see the relationship between hour (how many hours people spent playing a video game) and level (game level), and then draw a scatter plot with different colors for female (1) and male (0). So my x would be 'hour' and my y would be 'level'.

My data CSV file looks like this:

     hour  gender  level
0       8       1  20.00
1       9       1  24.95
2      12       0  10.67
3      12       0  18.00
4      12       0  17.50
5      13       0  13.07
6      10       0  14.45
...
499    12       1  19.47
500    16       0  13.28

Here's my code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df=pd.read_csv('data.csv')
plt.plot(x,y, lavel='some relationship')
plt.title("Some relationship")
plt.xlabel('hour')
plt.ylabel('level')
plt.plot[gender(gender=1), '-b', label=female]
plt.plot[gender(gender=0), 'gD', label=male]
plt.axs()
plt.show()

I would like to draw a graph like the following, with one line for male and one for female.

y=level|           @----->male
       | @
       | *         *----->female
       |________________ x=hour

However, I am not sure how to solve this problem. I keep getting the error NameError: name 'hour' is not defined.

Upvotes: 0

Views: 1525

Answers (1)

erocoar

Reputation: 5893

You could do it this way:

import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question; with the real file use df = pd.read_csv('data.csv')
df = pd.DataFrame(data={"hour": [8,9,12,12,12,13,10],
                        "gender": [1,1,0,0,0,0,0],
                        "level": [20, 24.95, 10.67, 18, 17.5, 13.07, 14.45]})

# Sort by hour so each line is drawn from left to right
df.sort_values("hour", ascending=True, inplace=True)

fig = plt.figure(dpi=80)
ax = fig.add_subplot(111, aspect='equal')

# In the question's coding, gender 1 is female and gender 0 is male
ax.plot(df.hour[df.gender==1], df.level[df.gender==1], c="red", label="female")
ax.plot(df.hour[df.gender==0], df.level[df.gender==0], c="blue", label="male")
plt.xlabel('hour')
plt.ylabel('level')
ax.legend()  # show which color belongs to which gender
plt.show()
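
Since the title asks for a scatter plot, here is a minimal sketch of that variant. It assumes the CSV is comma-separated with a header row of hour, gender, level (the question only shows the printed DataFrame, so the actual file layout is a guess), and it uses an in-memory string in place of data.csv so it runs on its own. The names and colors dictionaries are just illustrative helpers. Note that in the question's code the names x, y, and gender were never defined; the columns have to be taken from the DataFrame, which is the likely source of the NameError.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for data.csv, using the first rows shown in the question.
# With the real file, use pd.read_csv("data.csv") instead (assumed layout).
csv_text = """hour,gender,level
8,1,20.00
9,1,24.95
12,0,10.67
12,0,18.00
12,0,17.50
13,0,13.07
10,0,14.45
"""
df = pd.read_csv(io.StringIO(csv_text))

# Coding taken from the question: 1 = female, 0 = male.
names = {0: "male", 1: "female"}
colors = {0: "blue", 1: "red"}

fig, ax = plt.subplots()
for code, group in df.groupby("gender"):
    # One scatter call per gender so each gets its own color and legend entry.
    ax.scatter(group["hour"], group["level"], color=colors[code], label=names[code])

ax.set_xlabel("hour")
ax.set_ylabel("level")
ax.set_title("Some relationship")
ax.legend()
```

Call plt.show() afterwards to display the figure. Grouping by the gender column avoids repeating the boolean mask for each value and would extend to more categories by adding entries to the two dictionaries.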

Upvotes: 2
