Reputation: 855
I have a dataset with a category field, City, and two metrics, Age and Weight. I want to plot a scatterplot for each City using a loop. However, I'm struggling to combine the group by and the loop that I need in a single statement. If I just use a for loop, I end up with a chart for each record, and if I do a group by, I get the right number of charts but with no values.
Here is my code using just the for loop with my group by commented out:
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
d = { 'City': pd.Series(['London','New York', 'New York', 'London', 'Paris',
                         'Paris','New York', 'New York', 'London','Paris']),
      'Age' : pd.Series([36., 42., 6., 66., 38.,18.,22.,43.,34.,54]),
      'Weight': pd.Series([225,454,345,355,234,198,400, 256,323,310])
    }
df = pd.DataFrame(d)
#for C in df.groupby('City'):
for C in df.City:
    fig = plt.figure(figsize=(5, 4))
    # Create an Axes object.
    ax = fig.add_subplot(1,1,1) # one row, one column, first plot
    # Plot the data.
    ax.scatter(df.Age,df.Weight, df.City == C, color="red", marker="^")
Upvotes: 3
Views: 8857
Reputation: 879351
Do not call plt.figure more than once, as each call creates a new figure (roughly speaking, a new window).
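To make that concrete, here is a minimal sketch that counts the open figures after repeated plt.figure calls and compares it with creating one figure up front and reusing its axes. The Agg backend, the dummy points and the variable names are assumptions for illustration only, chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

plt.close("all")
for _ in range(3):
    plt.figure(figsize=(5, 4))          # every call opens another figure
n_after_figure_calls = len(plt.get_fignums())

plt.close("all")
fig, ax = plt.subplots(figsize=(5, 4))  # one figure with one axes
for _ in range(3):
    ax.scatter([1, 2], [3, 4])          # dummy points drawn on the same axes each pass
n_after_subplots = len(plt.get_fignums())

print(n_after_figure_calls, n_after_subplots)  # prints "3 1"
```

So a plt.figure call inside a loop gives one figure per iteration, which is what you want only when each group should get its own chart.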
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
d = {'City': ['London', 'New York', 'New York', 'London', 'Paris',
              'Paris', 'New York', 'New York', 'London', 'Paris'],
     'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
     'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310]}
df = pd.DataFrame(d)
fig, ax = plt.subplots(figsize=(5, 4))  # 1
df.groupby(['City']).plot(kind='scatter', x='Age', y='Weight',
                          ax=ax,  # 2
                          color=['red', 'blue', 'green'])
plt.show()
1. plt.subplots returns a figure, fig, and an axes, ax.
2. If you pass ax=ax to pandas' plot method, then all the plots will show up on the same axes.
To make a separate figure for each city:
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
d = {'City': ['London', 'New York', 'New York', 'London', 'Paris',
              'Paris', 'New York', 'New York', 'London', 'Paris'],
     'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
     'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310]}
df = pd.DataFrame(d)
groups = df.groupby(['City'])
for city, grp in groups:  # 1
    fig, ax = plt.subplots(figsize=(5, 4))
    grp.plot(kind='scatter', x='Age', y='Weight',  # 2
             ax=ax)
plt.show()
Use grp, the sub-DataFrame, instead of df inside the for-loop (the line marked # 2).
Upvotes: 2
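For the shared-axes variant, if the intent is one color per city, a possible sketch is to loop over the groups and draw each sub-DataFrame onto the same axes. The colors mapping, the axis labels and the legend are assumptions added for illustration, as is the Agg backend (used so the snippet runs without a display). Grouping by the plain string 'City' rather than a one-element list keeps the group key a scalar.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'City': ['London', 'New York', 'New York', 'London', 'Paris',
             'Paris', 'New York', 'New York', 'London', 'Paris'],
    'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
    'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310],
})

# Hypothetical city-to-color mapping, not part of the original answer.
colors = {'London': 'red', 'New York': 'blue', 'Paris': 'green'}

fig, ax = plt.subplots(figsize=(5, 4))    # a single figure and axes for all cities
for city, grp in df.groupby('City'):      # grp holds only the rows for this city
    ax.scatter(grp['Age'], grp['Weight'], color=colors[city], label=city)

ax.set_xlabel('Age')
ax.set_ylabel('Weight')
ax.legend()
```

Call plt.show() afterwards in an interactive session to display the figure.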
Reputation: 855
I've used the group by from the other answer and inserted it into my code to generate one chart per group:
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
d = { 'City': pd.Series(['London','New York', 'New York', 'London','Paris',
                         'Paris','New York', 'New York', 'London','Paris']),
      'Age' : pd.Series([36., 42., 6., 66., 38.,18.,22.,43.,34.,54]) ,
      'Weight': pd.Series([225,454,345,355,234,198,400, 256,323,310])
    }
df = pd.DataFrame(d)
groups = df.groupby(['City'])
for city, grp in groups:
    fig = plt.figure(figsize=(5, 4))
    # Create an Axes object.
    ax = fig.add_subplot(1, 1, 1)  # one row, one column, first plot
    # Plot only this city's rows (grp), not the whole DataFrame.
    ax.scatter(grp.Age, grp.Weight, color="red", marker="^")
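The same per-city split can also be done without groupby, by filtering rows with a boolean mask such as df['City'] == city. The sketch below is one possible version. The figures dictionary, the subset name and the titles are assumptions for illustration, and the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'City': ['London', 'New York', 'New York', 'London', 'Paris',
             'Paris', 'New York', 'New York', 'London', 'Paris'],
    'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
    'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310],
})

figures = {}  # hypothetical container: city name -> (figure, axes)
for city in df['City'].unique():
    subset = df[df['City'] == city]          # boolean mask keeps this city's rows only
    fig, ax = plt.subplots(figsize=(5, 4))   # one new figure per city
    ax.scatter(subset['Age'], subset['Weight'], color="red", marker="^")
    ax.set_title(city)
    figures[city] = (fig, ax)
```

Here the mask filters the rows before plotting, so each chart receives only its own city's points.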
Upvotes: 2