Martien van den Broek

Reputation: 19

Matplotlib data visualization - height argument must be scalar

I've been trying my hand at some data visualization with Python and Matplotlib. In this case I'm trying to visualize the amount of data missing per column. I ran a short script to count the missing values per column and stored the result in the array missing_count. I would now like to show this in a bar chart using Matplotlib, but I've run into an issue with this code:

import matplotlib.pyplot as plt
import numpy as np

missing_count = np.array([33597, 0, 0, 0, 0, 0, 0, 12349, 0, 0, 12349, 0, 0, 0, 115946, 47696, 44069, 81604, 5416, 5416, 5416, 5416, 0, 73641, 74331, 187204, 128829, 184118, 116441, 183093, 153048, 187349, 89918, 89918, 89918, 89918, 89918, 89918, 51096, 51096, 51096, 51096, 51096, 51096, 51096, 51096, 51096, 51096])

n = len(missing_count)
index = np.arange(n)

fig, ax = plt.subplots()

r1 = ax.bar(index, n, 0.15, missing_count, color='r')

ax.set_ylabel('NULL values')
ax.set_title('Amount of NULL values per colum')
ax.set_xticks(index + width / 2)
ax.set_xticklabels(list(originalData.columns.values))

plt.show()

Resulting in this error:

ValueError                                Traceback (most recent call last)
<ipython-input-34-285ca1e9de68> in <module>()
     10 fig, ax = plt.subplots()
     11 
---> 12 r1 = ax.bar(index, n, 0.15, missing_count, color='r')
     13 
     14 ax.set_ylabel('NULL values')

C:\Users\Martien\Anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, *args, **kwargs)
   1895                     warnings.warn(msg % (label_namer, func.__name__),
   1896                                   RuntimeWarning, stacklevel=2)
-> 1897             return func(ax, *args, **kwargs)
   1898         pre_doc = inner.__doc__
   1899         if pre_doc is None:

C:\Users\Martien\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in bar(self, left, height, width, bottom, **kwargs)
   2077         if len(height) != nbars:
   2078             raise ValueError("incompatible sizes: argument 'height' "
-> 2079                               "must be length %d or scalar" % nbars)
   2080         if len(width) != nbars:
   2081             raise ValueError("incompatible sizes: argument 'width' "

ValueError: incompatible sizes: argument 'height' must be length 48 or scalar

I've looked at the Matplotlib documentation, which tells me that height should be a scalar, but it does not explain what this scalar is. There is also this example I've followed, which does work when I run it.

I've run out of ideas as to why I get this error. Any help would be much appreciated.

Edit: originalData is the original CSV file I read in, I only use it here to name my bars

Upvotes: 0

Views: 409

Answers (2)

Mohammad Athar

Reputation: 1980

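According to https://matplotlib.org/devdocs/api/_as_gen/matplotlib.pyplot.bar.html

the second argument must be the height.

You're passing n as the second argument, which is a single number (the length of missing_count), while missing_count itself ends up in the fourth position, which is the bottom argument.

Try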

r1 = ax.bar(index, missing_count, 0.15, color='r')

instead, which should get the job done.

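Even better, be explicit about your argument names (tedious, and harder to keep clean, but a good idea when you have more than a few arguments). Note that the traceback in the question shows the first parameter named left in the installed version; newer Matplotlib releases call it x.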

r1 = ax.bar(x=index, height = missing_count, width = 0.15, color='r')

Here, height corresponds to the count for each bar. Say you had an array of zeros and ones:

A = [0,0,0,0,1,1,1]

Plotting the counts of A would give a bar plot with two bars: one 4 units high (since there are four zeros) and the other 3 units high (since there are three ones).

The command

r1 = ax.bar([0,1], [4,3], 0.15, color='r')

would make a plot with a bar at zero and a bar at 1. The first bar would be 4 units high, the second would be 3 units high.

Translating to your code, missing_count corresponds not to the raw array A but to the counts derived from it, i.e. [Counter(A)[x] for x in Counter(A)], which is [4, 3] here.
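As a rough sketch of that last point, the snippet below derives bar positions and heights from the toy array A using collections.Counter. The names positions and heights are only illustrative and do not appear in the question.

```python
from collections import Counter

A = [0, 0, 0, 0, 1, 1, 1]

# Count how often each distinct value occurs in A.
counts = Counter(A)

# One bar per distinct value: x positions and matching heights.
positions = sorted(counts)
heights = [counts[value] for value in positions]

print(positions)  # [0, 1]
print(heights)    # [4, 3]
```

These two lists are what you would pass as the first and second arguments of ax.bar, in the same way that index and missing_count are passed in the corrected call.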

Upvotes: 2

ImportanceOfBeingErnest

Reputation: 339230

In the code, n is a scalar. You probably do not want the bar height to be constant, but rather the values from missing_count.

ax.bar(index, missing_count, 0.15, color='r')
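For reference, here is a minimal sketch of the whole script with that call in place. It makes a few assumptions that are not in the question: it uses only the first eight values of missing_count, it defines width (the question uses it in set_xticks without defining it), it replaces originalData.columns with made-up labels, and it assumes a Matplotlib version where bars are centered on their x positions by default, so the ticks sit at index.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# First eight values of the question's missing_count, shortened for brevity.
missing_count = np.array([33597, 0, 0, 0, 0, 0, 0, 12349])

n = len(missing_count)
index = np.arange(n)
width = 0.15  # assumed bar width; undefined in the question's code

# Hypothetical labels; the question takes them from originalData.columns.
labels = ["col_%d" % i for i in range(n)]

fig, ax = plt.subplots()

# x positions first, then one height per bar, then the bar width.
bars = ax.bar(index, missing_count, width, color='r')

ax.set_ylabel('NULL values')
ax.set_title('Amount of NULL values per column')
ax.set_xticks(index)
ax.set_xticklabels(labels, rotation=90)

# In an interactive session, drop the Agg line above and call plt.show() here.
```

With 48 columns the labels will be crowded, which is why the sketch rotates them by 90 degrees.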

Upvotes: 1
