Jake Bourne

Reputation: 763

seaborn heatmap pandas calculation on isnull

I'm computing, for each column of a DataFrame, the percentage of NaNs relative to the total number of rows, which gives a Series as shown:

data = df.isnull().sum()/len(df)*100

RecordID          0.000000
ContactID         0.000000
EmailAddress      0.000000
ExternalID      100.000000
Date              0.000000
Name              0.000000
Owner            67.471362
Priority          0.000000
Status            0.000000
Subject           0.000000
Description       0.000000
Type              0.000000
dtype: float64
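For anyone wanting to reproduce this without my data, here is a minimal sketch of the same calculation. The toy DataFrame and its values are made up for illustration; only the column names are borrowed from the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the real table): 4 rows, one column fully missing.
df = pd.DataFrame({
    "RecordID": [1, 2, 3, 4],
    "ExternalID": [np.nan] * 4,
    "Owner": ["a", np.nan, np.nan, np.nan],
})

# Count NaNs per column, then express the count as a percentage of the row count.
data = df.isnull().sum() / len(df) * 100

print(data)
print(type(data).__name__, data.shape)  # the result is a 1D Series, one entry per column
```

The result has one entry per column and a single dimension, which turns out to matter for the plotting call.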

What I'm keen to do is represent this as a heatmap in seaborn with sns.heatmap(data), drawing the reader's attention to the columns at 100% and 67%. Unfortunately, I'm getting this error:

IndexError: Inconsistent shape between the condition and the input (got (12, 1) and (12,))

Full traceback:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-17-05db696a3a9b> in <module>()
----> 1 sns.heatmap(data)

~\AppData\Local\Programs\Python\Python36-32\lib\site-packages\seaborn\matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, xticklabels, yticklabels, mask, ax, **kwargs)
    515     plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
    516                           annot_kws, cbar, cbar_kws, xticklabels,
--> 517                           yticklabels, mask)
    518 
    519     # Add the pcolormesh kwargs here

~\AppData\Local\Programs\Python\Python36-32\lib\site-packages\seaborn\matrix.py in __init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
    114         mask = _matrix_mask(data, mask)
    115 
--> 116         plot_data = np.ma.masked_where(np.asarray(mask), plot_data)
    117 
    118         # Get good names for the rows and columns

~\AppData\Local\Programs\Python\Python36-32\lib\site-packages\numpy\ma\core.py in masked_where(condition, a, copy)
   1934     if cshape and cshape != ashape:
   1935         raise IndexError("Inconsistent shape between the condition and the input"
-> 1936                          " (got %s and %s)" % (cshape, ashape))
   1937     if hasattr(a, '_mask'):
   1938         cond = mask_or(cond, a._mask)

IndexError: Inconsistent shape between the condition and the input (got (12, 1) and (12,))

My research is hitting a lot of walls: I keep landing on either numpy broadcasting rules or a bug from 3 years ago, neither of which is super helpful.

Thanks as always.

Upvotes: 2

Views: 5208

Answers (1)

Jan K

Reputation: 4150

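Your data variable is an instance of pd.Series, which is inherently 1D, whereas sns.heatmap expects 2D input. That is why the traceback reports a (12, 1) mask against (12,) data. A quick fix is to convert the Series to a single-column DataFrame, for example: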

sns.heatmap(data.to_frame())
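A slightly fuller sketch of the same fix follows. The toy DataFrame, the "% missing" column label, and the styling arguments (annot, fmt, vmin, vmax, cmap) are my own assumptions for illustration, not something taken from the question, so adjust them to taste.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the question's df: 4 rows, one column fully missing.
df = pd.DataFrame({
    "RecordID": [1, 2, 3, 4],
    "ExternalID": [np.nan, np.nan, np.nan, np.nan],
    "Owner": ["a", np.nan, np.nan, "b"],
})

data = df.isnull().sum() / len(df) * 100   # 1D Series, shape (3,)
frame = data.to_frame(name="% missing")    # 2D DataFrame, shape (3, 1)

# Fixing the colour scale to 0-100 keeps colours comparable between runs;
# annot writes each percentage into its cell.
ax = sns.heatmap(frame, annot=True, fmt=".1f", vmin=0, vmax=100, cmap="Reds")
ax.figure.savefig("missing_heatmap.png", bbox_inches="tight")
```

Each original column becomes one row of the heatmap, so the columns with the most missing values stand out as the darkest cells.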

Upvotes: 8
