Luca

Reputation: 35

Have each histogram bin with a different color

I plotted a histogram and would like each bin to have a different color. Right now I get the error message: "The 'color' keyword argument must have one color per dataset, but 1 datasets and 10 colors were provided"

I am attaching a screenshot of the histogram as well. Thanks in advance.

decades = np.arange(1910, 2020, 10)
colors = ['aqua', 'red', 'gold', 'royalblue', 'darkorange', 'green', 'purple', 'cyan', 'yellow', 'lime']

plt.figure(figsize=(12,7))
plt.hist(df.Year, bins=decades, color=colors)
plt.xticks(decades);

Upvotes: 1

Views: 8790

Answers (2)

Trenton McKinney

Reputation: 62373

  • The following answers the question for the data in the OP, not the title.
  • A histogram is best used for continuous data (e.g. floats). This data is years by decade, so it's discrete, which means this is just a bar plot of value counts.
  • As per the OP, the data is in a pandas dataframe (df.Year), so get the .value_counts of 'Year', and then plot the resulting Series with pandas.Series.plot and kind='bar', which uses matplotlib as the backend. This also has color as a parameter.
  • Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.2, seaborn 0.11.2
import pandas as pd
import numpy as np
import seaborn as sns  # needed for sns.countplot

# sample data
np.random.seed(365)
data = {'Year': np.random.choice(np.arange(1910, 2020, 10), size=1100)}
df = pd.DataFrame(data)

# display(df.head())
#    Year
# 0  1930
# 1  1950
# 2  1920
# 3  1960
# 4  1930

# get the value counts and sort
vc = df.Year.value_counts().sort_index()

# plot
colors = ['aqua', 'red', 'gold', 'royalblue', 'darkorange', 'green', 'purple', 'steelblue', 'yellow', 'lime', 'magenta']
vc.plot(kind='bar', color=colors, width=1, rot=0, ec='k')


sns.countplot

  • seaborn is a high-level API for matplotlib
  • With .countplot there's no need to use .value_counts()
p = sns.countplot(data=df, x='Year', palette=colors)
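
For readers without pandas or seaborn, the same bar-plot-of-counts idea can be sketched with NumPy and matplotlib alone. This is an illustrative alternative, not part of the tested code above: np.unique with return_counts=True stands in for .value_counts().sort_index(), and the sample `years` array, the seed, and the Agg backend are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# hypothetical sample data: 1100 years drawn from the decades 1910..2010
rng = np.random.default_rng(365)
years = rng.choice(np.arange(1910, 2020, 10), size=1100)

# sorted unique values and how often each one occurs
labels, counts = np.unique(years, return_counts=True)

colors = ['aqua', 'red', 'gold', 'royalblue', 'darkorange', 'green',
          'purple', 'steelblue', 'yellow', 'lime', 'magenta']

fig, ax = plt.subplots(figsize=(12, 7))
positions = np.arange(len(labels))
# unlike hist, bar accepts one color per bar
bars = ax.bar(positions, counts, width=1,
              color=colors[:len(labels)], edgecolor='k')
ax.set_xticks(positions)
ax.set_xticklabels(labels)
```

Because the bars sit at integer positions and the decades are only tick labels, each category is the same width regardless of its numeric value.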


Upvotes: 4

Rutger Kassies

Reputation: 64443

The color keyword is only for the case where you want to plot multiple datasets (= histograms) at once. It can't be used to color the bars individually.
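
To illustrate, here is a minimal sketch of the case the keyword is meant for: one color per dataset. The arrays `a` and `b`, the seed, and the Agg backend are assumptions made only for this illustration. The second call is expected to reproduce the ValueError from the question on the matplotlib versions I am aware of.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

decades = np.arange(1910, 2020, 10)
rng = np.random.default_rng(0)
# two hypothetical datasets of years, for illustration only
a = rng.integers(1910, 2010, size=200)
b = rng.integers(1910, 2010, size=200)

fig, ax = plt.subplots()
# two datasets, two colors: this is what the color keyword supports
counts, edges, containers = ax.hist([a, b], bins=decades,
                                    color=['red', 'royalblue'])

# one dataset with ten colors: the mismatch from the question
try:
    ax.hist(a, bins=decades, color=['red'] * 10)
    error_raised = False
except ValueError:
    error_raised = True
```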

You can, however, capture the result of the hist call and iterate over it to set the colors. This also lets you use the count or bin information if you need to (to color based on the value, for example). Assigning each bar a unique color based on its order, as in your example, looks like this:

import matplotlib.pyplot as plt
import numpy as np

decades = np.arange(1910, 2020, 10)
data = np.random.gamma(4, scale=0.2, size=1000)*110+1910
colors = ['aqua', 'red', 'gold', 'royalblue', 'darkorange', 'green', 'purple', 'cyan', 'yellow', 'lime']

fig, ax = plt.subplots(figsize=(8,4), facecolor='w')
cnts, values, bars = ax.hist(data, edgecolor='k', bins=decades)
ax.set_xticks(decades)

for i, (cnt, value, bar) in enumerate(zip(cnts, values, bars)):
    bar.set_facecolor(colors[i % len(colors)])


Or color each bar based on its count:

cmap = plt.cm.viridis

for i, (cnt, value, bar) in enumerate(zip(cnts, values, bars)):
    bar.set_facecolor(cmap(cnt/cnts.max()))
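
The bin edges returned by hist can be used the same way. As a sketch, assuming you want the color to follow the decade rather than the count, a Normalize instance can map each bin's left edge onto the colormap. The seed, the variable names `edges` and `left_edge`, and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

decades = np.arange(1910, 2020, 10)
rng = np.random.default_rng(0)
data = rng.gamma(4, scale=0.2, size=1000) * 110 + 1910

fig, ax = plt.subplots(figsize=(8, 4), facecolor='w')
cnts, edges, bars = ax.hist(data, edgecolor='k', bins=decades)
ax.set_xticks(decades)

cmap = plt.cm.viridis
# scale the left edges of the first and last bin to 0 and 1
norm = Normalize(vmin=edges[0], vmax=edges[-2])

# edges has one more entry than bars, so drop the final right edge
for left_edge, bar in zip(edges[:-1], bars):
    bar.set_facecolor(cmap(norm(left_edge)))
```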


Upvotes: 5
