Reputation: 575
I am trying to write a convenience function that creates, for each column in a pandas dataframe, a basic bar plot of the distinct values in that column and how often each one occurs in the dataset.
def plot_value_counts(df, leave_out):
    # is supposed to create the subplots grid where I can add the plots
    fig, axs = plt.subplots(int(len(df)/2) + 1,int(len(df)/2) + 1)
    for idx, name in enumerate(list(df)):
        if name == leave_out:
            continue
        else:
            axs[idx] = df[name].value_counts().plot(kind="bar")
    return fig, axs
This snippet runs forever and never finishes. I looked at similar questions on Stack Overflow but couldn't find anything specific to my case.
The use of the subplots function came from the following question: Is it possible to automatically generate multiple subplots in matplotlib?
Below is a short sample of the data file, so that everybody can reproduce the problem: https://gist.github.com/hentschelpatrick/e0a7e1400a4b5c356ec8b0e4952f8cc1#file-train-csv
Upvotes: 0
Views: 526
Reputation: 11
Here is a function that I wrote for my project to plot all columns in a pandas dataframe. It generates a grid with 4 plots per row (n x 4) and plots every column:
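import numpy as np
import matplotlib.pyplot as plt

def plotAllFeatures(dfData):
    plt.figure(1, figsize=(20, 50))
    # subplot() expects an integer row count, so cast the result of np.ceil
    n_rows = int(np.ceil(len(dfData.columns) / 4))
    for pos, feature in enumerate(dfData.columns, start=1):
        plt.subplot(n_rows, 4, pos)
        dfData[feature].plot(title=feature)
    plt.show()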
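The grid arithmetic used above can be checked on its own. The following is only a small sketch; `grid_shape` is a hypothetical helper name introduced for illustration and is not part of matplotlib or of the function above.

```python
import math


def grid_shape(n_plots, n_cols=4):
    """Return (rows, cols) for a grid with n_cols plots per row.

    Hypothetical helper, shown only to illustrate the row calculation.
    """
    # round up so a partially filled last row still gets its own grid row
    n_rows = math.ceil(n_plots / n_cols)
    # keep at least one row so an empty dataframe does not produce a 0-row grid
    return max(n_rows, 1), n_cols


rows, cols = grid_shape(9)  # 9 columns need 3 rows of 4
```

Using `math.ceil` returns a plain `int`, which avoids passing a float row count to `plt.subplot`.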
Upvotes: 1
Reputation: 4343
You can pass the Axes object to the plot method through its ax parameter (see the docs). You should also iterate over the columns:
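# len(df) is the number of rows; size the grid by the number of columns instead
n = int(len(df.columns)/2) + 1
fig, axs = plt.subplots(n, n, squeeze=False)
# flatten the 2-D grid so that axs[idx] is a single Axes
axs = axs.flatten()
for idx, name in enumerate(df.columns):
    if name == leave_out:
        continue
    else:
        df[name].value_counts().plot(kind="bar", ax=axs[idx])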
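For reference, the idea in this answer can be sketched as a self-contained function. Note that `len(df)` counts rows rather than columns, so a grid sized from it can contain a very large number of subplots for a long dataframe, which is a likely reason the original call never finishes. The sketch below sizes the grid from the number of plotted columns instead. The `n_cols` parameter, the `Agg` backend, and the sample dataframe are assumptions made for illustration; they are not taken from the question's data.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_value_counts(df, leave_out=None, n_cols=2):
    """Draw one value-count bar chart per column on a shared grid."""
    names = [name for name in df.columns if name != leave_out]
    n_rows = max(1, math.ceil(len(names) / n_cols))
    # squeeze=False always returns a 2-D array of Axes, even for a 1x1 grid
    fig, axs = plt.subplots(n_rows, n_cols, squeeze=False)
    flat_axes = axs.flatten()
    for ax, name in zip(flat_axes, names):
        # ax=... tells pandas which subplot to draw into
        df[name].value_counts().plot(kind="bar", ax=ax, title=name)
    # hide the grid cells that did not receive a plot
    for ax in flat_axes[len(names):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axs


# hypothetical sample data, not the asker's train.csv
sample = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "color": ["red", "blue", "red", "green"],
    "size": ["S", "M", "M", "M"],
    "shape": ["round", "square", "round", "round"],
})
fig, axs = plot_value_counts(sample, leave_out="id")
```

Filtering out `leave_out` before building the grid keeps the subplot index and the column index in step, so no cell is left empty in the middle of the figure.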
EDIT: If you have memory issues (the code doesn't seem to finish), first try it without subplots and show each plot separately:
for idx, name in enumerate(df.columns):
    if name == leave_out:
        continue
    else:
        df[name].value_counts().plot(kind="bar")
        plt.show()
Upvotes: 1