Reputation: 302
I have an array with different values, some of which are duplicates. How can I draw a histogram whose horizontal axis shows the name of each element and whose vertical axis shows how many times it occurs in the array?
arr= ['a','a','a','b','c','b']
Upvotes: 1
Views: 3032
Reputation: 29407
Histograms are used for numerical data. For string (categorical) data like yours, a bar chart is a better fit. Here's the code to generate a bar chart with matplotlib:
import matplotlib.pyplot as plt
from collections import Counter
arr= ['a', 'a', 'a', 'b', 'c', 'b']
data = Counter(arr)
plt.bar(data.keys(), data.values())
plt.show()
Even though a histogram and a bar chart look similar in this case, a histogram can give unexpected results, for instance if you request a specific number of bins.
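To make the point about bins concrete, here is a minimal sketch using NumPy's histogram function. It assumes, for illustration only, that each distinct string is placed at an integer position in order of first appearance; the names `positions` and `codes` are hypothetical.

```python
import numpy as np

arr = ['a', 'a', 'a', 'b', 'c', 'b']

# Assumption for illustration: each distinct string gets an integer position
# in order of first appearance (a -> 0, b -> 1, c -> 2).
positions = {label: i for i, label in enumerate(dict.fromkeys(arr))}
codes = [positions[label] for label in arr]

# Asking for 2 bins over 3 categories merges 'b' and 'c' into one bin.
counts, edges = np.histogram(codes, bins=2)
print(counts)  # [3 3]
print(edges)   # [0. 1. 2.]
```

With two bins the result suggests two equally frequent groups, although the data has three categories with counts 3, 2 and 1. A bar chart built from the counts avoids this.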
Upvotes: 0
Reputation: 41437
Note that matplotlib's hist does not play nicely with string data (see the bar/tick positions):
import matplotlib.pyplot as plt
arr = ['a', 'a', 'a', 'b', 'c', 'b']  # the list from the question
plt.hist(arr)
It's certainly possible to fix this manually, but it's easier to use pandas or seaborn. Both use matplotlib under the hood, but they provide better default formatting.
Also:
You can change the figure size with figsize. In these examples I've set figsize=(6, 3).
To rotate the x ticks, add plt.xticks(rotation=90).
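For reference, the manual fix mentioned above could look roughly like the sketch below. It is one possible approach, not the only one: the strings are mapped to integer positions and the bin edges are placed halfway between them so that each bar is centered on its tick. The names `labels`, `positions` and `bin_edges` are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

arr = ['a', 'a', 'a', 'b', 'c', 'b']

# Map each distinct label to an integer position (sorted order is an arbitrary choice).
labels = sorted(set(arr))
positions = [labels.index(value) for value in arr]

# One bin per label, with edges halfway between the integer positions.
bin_edges = np.arange(len(labels) + 1) - 0.5

fig, ax = plt.subplots(figsize=(6, 3))
counts, _, _ = ax.hist(positions, bins=bin_edges, rwidth=0.8)

# Put one tick at the center of each bar and label it with the category name.
ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels)
```

Call plt.show() afterwards to display the figure. This is more code than the pandas and seaborn options, which is why those are easier here.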
pandas value_counts and plot.bar
import pandas as pd
# The top-level pd.value_counts is deprecated in recent pandas versions; count via a Series instead
pd.Series(arr).value_counts().plot.bar(figsize=(6, 3))
seaborn histplot
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6, 3))
sns.histplot(arr, ax=ax)
seaborn countplot
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6, 3))
sns.countplot(x=arr, ax=ax)  # pass the values explicitly as x
collections.Counter with matplotlib bar
from collections import Counter
counts = Counter(arr)
fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(counts.keys(), counts.values())
numpy unique with matplotlib bar
import numpy as np
uniques, counts = np.unique(arr, return_counts=True)
fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(uniques, counts)
Upvotes: 1
Reputation:
There are multiple steps to this problem.
Step 1: You need to collect the data in a convenient form. Going by your example, a good option would be to make a list of the values with their counts. You could use the list's .count() method to accomplish that. Other methods are possible, of course.
Step 2: To display the data, you could use a library like matplotlib.pyplot. This may also take care of step 1, but that is not important.
If your use case is different, please provide more details so that we can help you better.
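The two steps could be sketched as follows. This is only one way to do it: `counts` and the helper `plot_counts` are hypothetical names, and the bar chart in step 2 assumes matplotlib is installed.

```python
arr = ['a', 'a', 'a', 'b', 'c', 'b']

# Step 1: count each distinct value with list.count().
# dict.fromkeys(arr) keeps the distinct values in order of first appearance.
counts = {value: arr.count(value) for value in dict.fromkeys(arr)}
print(counts)  # {'a': 3, 'b': 2, 'c': 1}


def plot_counts(counts):
    """Step 2: draw one bar per distinct value (hypothetical helper)."""
    # Imported inside the function so that step 1 runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_xlabel('element')
    ax.set_ylabel('count')
    plt.show()
```

Calling plot_counts(counts) shows the chart. Note that .count() rescans the list once per distinct value, so for long lists a single-pass counter such as collections.Counter is likely the better choice.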
Upvotes: 0
Reputation: 535
You can use the matplotlib library to plot a histogram directly from a list. The code for it goes as follows:
from matplotlib import pyplot as plt
arr= ['a','a','a','b','c','b']
plt.hist(arr)
plt.show()
You can read more about matplotlib's hist function here: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html
You can do other stuff like setting the color for your histogram plot, changing the alignment, and many other things.
Cheers!
Upvotes: 1