

TypeError using sns.distplot() on dataframe with one row

I'm plotting subsets of a dataframe, and one subset happens to have only one row. That is the only explanation I can think of for the problem. The subset looks like this:

problem_dataframe = prob_df[prob_df['Date']==7]


I try to do:


But I get the error:

TypeError: len() of unsized object

Would someone please tell me what's causing this and how to work around it?

Upvotes: 1


Answers (1)



The TypeError is resolved by setting bins=1.

But that uncovers a different error, ValueError: x must be 1D or 2D, which is raised by _normalize_input(), an internal helper used by Matplotlib's hist():

import pandas as pd
import seaborn as sns
df = pd.DataFrame(['Tue','Feb',7,'15:37:58',2017,15.6196]).T
df.columns = ['Day','Month','Date','Time','Year','floatTime']
sns.distplot(df.floatTime, bins=1)


ValueError                                Traceback (most recent call last)
<ipython-input-25-858df405d200> in <module>()
      6 df.columns = ['Day','Month','Date','Time','Year','floatTime']
      7 df.floatTime.values.astype(float)
----> 8 sns.distplot(df.floatTime, bins=1)

/home/andrew/anaconda3/lib/python3.6/site-packages/seaborn/ in distplot(a, bins, hist, kde, rug, fit, hist_kws, kde_kws, rug_kws, fit_kws, color, vertical, norm_hist, axlabel, label, ax)
    213         hist_color = hist_kws.pop("color", color)
    214         ax.hist(a, bins, orientation=orientation,
--> 215                 color=hist_color, **hist_kws)
    216         if hist_color != color:
    217             hist_kws["color"] = hist_color

/home/andrew/anaconda3/lib/python3.6/site-packages/matplotlib/ in inner(ax, *args, **kwargs)
   1890                     warnings.warn(msg % (label_namer, func.__name__),
   1891                                   RuntimeWarning, stacklevel=2)
-> 1892             return func(ax, *args, **kwargs)
   1893         pre_doc = inner.__doc__
   1894         if pre_doc is None:

/home/andrew/anaconda3/lib/python3.6/site-packages/matplotlib/axes/ in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   6141             x = np.array([[]])
   6142         else:
-> 6143             x = _normalize_input(x, 'x')
   6144         nx = len(x)  # number of datasets

/home/andrew/anaconda3/lib/python3.6/site-packages/matplotlib/axes/ in _normalize_input(inp, ename)
   6080                 else:
   6081                     raise ValueError(
-> 6082                         "{ename} must be 1D or 2D".format(ename=ename))
   6083                 if inp.shape[1] < inp.shape[0]:
   6084                     warnings.warn(

ValueError: x must be 1D or 2D

_normalize_input() has since been removed from Matplotlib (it looks like sometime last year), so the traceback suggests that the Matplotlib installed alongside Seaborn here is an older release.

You can see _normalize_input() in this old commit:

def _normalize_input(inp, ename='input'):
    """Normalize 1 or 2d input into list of np.ndarray or
    a single 2D np.ndarray.

    inp : iterable
    ename : str, optional
        Name to use in ValueError if `inp` can not be normalized
    """
    if (isinstance(x, np.ndarray) or
            not iterable(cbook.safe_first_element(inp))):
        # TODO: support masked arrays;
        inp = np.asarray(inp)
        if inp.ndim == 2:
            # 2-D input with columns as datasets; switch to rows
            inp = inp.T
        elif inp.ndim == 1:
            # new view, single row
            inp = inp.reshape(1, inp.shape[0])
        else:
            raise ValueError(
                "{ename} must be 1D or 2D".format(ename=ename))

I can't figure out why inp.ndim!=1, though. Performing the same np.asarray().ndim on the input returns 1 as expected:

np.asarray(df.floatTime).ndim  # 1
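One possible explanation, which I have not verified against the Seaborn source for this version: if distplot() squeezes its input before handing it to hist(), a one-element array collapses to a 0-dimensional array. That would account for both errors, because len() fails on a 0-d array and its ndim is neither 1 nor 2. The NumPy behaviour itself is easy to check. The variable names below are only for illustration:

```python
import numpy as np

# A one-element, 1-D array, like the single-row column above.
one_value = np.asarray([15.6196])
print(one_value.ndim)  # 1

# squeeze() drops every length-1 axis, leaving a 0-d array.
squeezed = one_value.squeeze()
print(squeezed.ndim)  # 0

# len() is undefined for 0-d arrays and raises TypeError.
try:
    len(squeezed)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)  # len() of unsized object
```

If that is what happens inside distplot(), the array reaching _normalize_input() would have ndim 0, which falls through to the final else branch.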

So you're facing a few obstacles if you want to make a single-valued input work with sns.distplot().

Suggested Workaround
Check for a single-element df.floatTime, and if that's the case, just use plt.hist() instead (distplot calls it internally anyway, in addition to drawing the KDE).
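A minimal sketch of that check follows. The helper name plot_distribution and its ax parameter are made up for illustration, and the values are cast to float on the assumption that the column holds numeric data stored as object dtype (as in the example dataframe above). Note that distplot() is deprecated in newer Seaborn releases, so the multi-value branch only applies to versions that still provide it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_distribution(values, ax=None):
    """Hypothetical helper: plain histogram for one value, distplot otherwise."""
    # Cast to a float array that is always at least 1-D.
    data = np.atleast_1d(np.asarray(values, dtype=float))
    if ax is None:
        ax = plt.gca()
    if data.size == 1:
        # A KDE is not meaningful for one observation, so draw a single bar.
        ax.hist(data, bins=1)
    else:
        import seaborn as sns  # imported lazily; only this branch needs it
        sns.distplot(data, ax=ax)
    return ax


# Single-value case, using the float from the example dataframe.
ax = plot_distribution([15.6196])
```

With the dataframe from the question, the call would be plot_distribution(df.floatTime).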


[Image: single-element histogram]

Upvotes: 1
