I have a table like:
value type
10 0
12 1
13 1
14 2
Generate some dummy data:
import numpy as np
value = np.random.randint(1, 20, 10)
type = np.random.choice([0, 1, 2], 10)
I want to accomplish a task in Python 3 with matplotlib (v1.4):
- plot a histogram of value
- group by type, i.e. use different colors to differentiate types
- use identity for bins, i.e. the width of a bin is 1
The questions are:
- How do I assign colors based on type and draw them from a colormap (e.g. Accent or another cmap in matplotlib)? I don't want to use named colors (i.e. 'b', 'k', 'r').
Note:
- I tried pandas.plot for two hours and failed to get the desired histogram.
- Ideally this would use only matplotlib.pyplot, without importing a bunch of modules such as matplotlib.cm, matplotlib.colors.
Upvotes: 7
Views: 29062
Whenever you need to plot a variable grouped by another (using color), seaborn usually provides a more convenient way to do that than matplotlib or pandas. So here is a solution using the seaborn histplot function:
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import matplotlib.pyplot as plt # v 3.3.2
import seaborn as sns # v 0.11.0
# Set parameters for random data
rng = np.random.default_rng(seed=1) # random number generator
size = 50
xmin = 1
xmax = 20
# Create random dataframe
df = pd.DataFrame(dict(value=rng.integers(xmin, xmax, size=size),
                       val_type=rng.choice([0, 1, 2], size=size)))
# Create histogram with discrete bins (bin width is 1), colored by type
fig, ax = plt.subplots(figsize=(10,4))
sns.histplot(data=df, x='value', hue='val_type', multiple='dodge', discrete=True,
             edgecolor='white', palette=plt.cm.Accent, alpha=1)
# Create x ticks covering the range of all integer values of df['value']
ax.set_xticks(np.arange(df['value'].min(), df['value'].max()+1))
# Additional formatting
sns.despine()
ax.get_legend().set_frame_on(False)
plt.show()
As you can see, because this is a histogram and not a bar plot, there is no space between the bars except where values on the x axis are absent from the dataset, as is the case for 12 and 14.
Since the accepted answer provides a bar plot in pandas, and a bar plot can be a reasonable way to display a histogram in certain situations, here is how to create one with seaborn using the countplot function:
# For some reason the palette argument in countplot is not processed the
# same way as in histplot, so here I fetch the colors from the previous
# example to make it easier to compare them. Note that a set is unordered,
# so the colors are not guaranteed to map to the same types as above.
colors = list(set(patch.get_facecolor() for patch in ax.patches))
# Create bar chart of counts of each value grouped by type
fig, ax = plt.subplots(figsize=(10,4))
sns.countplot(data=df, x='value', hue='val_type', palette=colors,
saturation=1, edgecolor='white')
# Additional formatting
sns.despine()
ax.get_legend().set_frame_on(False)
plt.show()
As this is a bar plot, the values 12 and 14 are not included, which makes the plot somewhat misleading because no empty space is shown for those values. On the other hand, there is some space between each group of bars, which makes it easier to see which value each bar belongs to.
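For comparison, here is a minimal sketch of a similar grouped histogram that uses only matplotlib.pyplot and NumPy. It is not part of the solution above: the helper name `grouped_hist` and its parameters are hypothetical, and it assumes integer values. It samples one color per type from the colormap via `plt.get_cmap` and places the bin edges at half-integers, so every integer gets its own bin of width 1 and absent values still leave an empty slot.

```python
import numpy as np
import matplotlib.pyplot as plt


def grouped_hist(values, types, cmap_name="Accent"):
    """Histogram of integer `values` with unit-width bins, one color per type.

    Hypothetical helper for illustration; returns the figure, the axes and
    the per-type counts as an array of shape (n_types, n_bins).
    """
    values = np.asarray(values)
    types = np.asarray(types)
    levels = np.unique(types)  # sorted distinct types

    # One array of values per type; hist() draws them side by side in each bin
    groups = [values[types == t] for t in levels]

    # Sample one color per type from the colormap (no matplotlib.cm import)
    colors = plt.get_cmap(cmap_name)(np.linspace(0, 1, len(levels)))

    # Edges at half-integers so each integer sits in the middle of its own bin
    edges = np.arange(values.min() - 0.5, values.max() + 1.5)

    fig, ax = plt.subplots(figsize=(10, 4))
    counts, _, _ = ax.hist(groups, bins=edges, color=colors,
                           label=[str(t) for t in levels])
    ax.set_xticks(np.arange(values.min(), values.max() + 1))
    ax.legend(title="type", frameon=False)
    return fig, ax, np.atleast_2d(counts)
```

With the dataframe from this answer, something like `grouped_hist(df['value'], df['val_type'])` followed by `plt.show()` should display the figure. Sampling the colormap with `np.linspace` spreads the types evenly over it, which is a design choice of this sketch rather than something matplotlib requires.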
Upvotes: 0
For your first question, we can create a dummy column equal to 1, and then generate counts by summing this column, grouped by value and type.
For your second question, you can pass the colormap directly into plot using the colormap parameter:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import seaborn
seaborn.set() #make the plots look pretty
# value and type are the arrays generated in the question
df = pd.DataFrame({'value': value, 'type': type})
df['dummy'] = 1
ag = df.groupby(['value','type']).sum().unstack()
ag.columns = ag.columns.droplevel()
ag.plot(kind = 'bar', colormap = cm.Accent, width = 1)
plt.show()
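As an alternative to the dummy column, the same table of counts can probably be built with `pd.crosstab`. The sketch below is an assumption-laden variant, not the code above: the helper name `count_table` is hypothetical, and it assumes integer values. It also reindexes over the full integer range, so that values which never occur still get a row of zeros and therefore an empty slot in the bar plot.

```python
import pandas as pd


def count_table(values, types):
    """Counts of each integer value per type, including absent values.

    Hypothetical helper for illustration only.
    """
    df = pd.DataFrame({"value": values, "type": types})

    # crosstab counts each (value, type) pair, so no dummy column is needed
    counts = pd.crosstab(df["value"], df["type"])

    # Reindex over the full integer range; absent values become rows of zeros
    full_range = pd.RangeIndex(int(df["value"].min()),
                               int(df["value"].max()) + 1,
                               name="value")
    return counts.reindex(full_range, fill_value=0)
```

The result can be plotted in the same way as ag, for example with `count_table(value, type).plot(kind='bar', colormap='Accent', width=1)`. Passing the colormap by name avoids the matplotlib.cm import.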
Upvotes: 9