Ani

Reputation: 2988

boxplot error: > 1 ndim Categorical are not supported at this time

I have two NumPy arrays, as follows:

clf_scores = numpy.array(
      [[ 0.66333666,  0.65634366,  0.63836164,  0.64435564,  0.658     ,
         0.641     ,  0.67167167,  0.66066066,  0.67167167,  0.65165165],
       [ 0.6983017 ,  0.70629371,  0.70529471,  0.68331668,  0.702     ,
         0.688     ,  0.71371371,  0.69269269,  0.70770771,  0.6996997 ],
       [ 0.65934066,  0.68531469,  0.65834166,  0.66333666,  0.677     ,
         0.668     ,  0.68568569,  0.68668669,  0.6996997 ,  0.68168168],
         ....         ....         ....         ....         ....
       [ 0.68731269,  0.71928072,  0.7002997 ,  0.70929071,  0.723     ,
         0.697     ,  0.68968969,  0.71271271,  0.72672673,  0.6996997 ],
       [ 0.68731269,  0.72027972,  0.6973027 ,  0.70729271,  0.726     ,
         0.695     ,  0.68568569,  0.71271271,  0.72572573,  0.6996997 ],
       [ 0.69030969,  0.71728272,  0.6983017 ,  0.70929071,  0.725     ,
         0.698     ,  0.68668669,  0.71371371,  0.72572573,  0.6996997 ]])

and

Trees = numpy.array(
      [  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
        14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,
        27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,
        40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,  52,
        53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,
        66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,
        79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,
        92,  93,  94,  95,  96,  97,  98,  99, 100])

These arrays have shapes (100, 10) and (100,), respectively.

How do I plot these two arrays using seaborn.boxplot?

I tried to plot them as follows:

sns.boxplot(clf_scores,Trees)

However, I got the following error:

NotImplementedError: > 1 ndim Categorical are not supported at this time

How do I correct this to get an appropriate boxplot?

PS: The data set was obtained by computing cross_val_score for a RandomForestClassifier with nTrees = 100.

The expected output looks roughly like the following (image not shown).

Upvotes: 0

Views: 939

Answers (1)

Tim Tröndle

Reputation: 465

The easiest way, in my view, is to convert your data to a pandas DataFrame first and then plot it with seaborn:

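 import numpy as np
 import pandas as pd
 import seaborn as sns

 # clf_scores has shape (100, 10); transposing gives one column per row of
 # clf_scores (one per number of trees) and one row per cross-validation score.
 df = pd.DataFrame(np.transpose(clf_scores))
 # With a wide-form DataFrame, seaborn draws one box per column.
 sns.boxplot(data=df)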

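The DataFrame df is in the 'wide-form DataFrame' format described in the boxplot documentation: each column (here, one per number of trees) becomes one box. In your call, the two arrays were passed positionally, and the seaborn version that raises this error interprets the first two positional arguments as the x and y variables rather than as a data set. It then tried to treat the two-dimensional clf_scores array as a categorical variable, which it is not.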
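If you also want the boxes labelled with the actual tree counts, here is a minimal sketch of one way to do it. The real scores are elided in the question, so the data below is random placeholder data with the same shapes; the column names n_trees and score and the helper plot_scores are my own choices for illustration, not anything seaborn requires.

```python
import numpy as np
import pandas as pd

# Placeholder data with the same shapes as in the question (the real
# scores are elided there): 100 tree counts x 10 cross-validation folds.
rng = np.random.default_rng(0)
clf_scores = rng.uniform(0.60, 0.75, size=(100, 10))
Trees = np.arange(1, 101)

# Wide form: one column per tree count, one row per fold.
wide_df = pd.DataFrame(clf_scores.T, columns=Trees)

# Long form: one row per (tree count, fold) score.
long_df = wide_df.melt(var_name="n_trees", value_name="score")


def plot_scores(df):
    """Draw one box per tree count from the long-form frame (hypothetical helper)."""
    import seaborn as sns  # imported here so the reshaping runs without seaborn

    ax = sns.boxplot(data=df, x="n_trees", y="score")
    ax.set_xlabel("Number of trees")
    ax.set_ylabel("Cross-validation score")
    return ax
```

Calling plot_scores(long_df) should give one box per tree count. Passing sns.boxplot(data=wide_df) works as well; the long form just makes the x and y variables explicit, which avoids the positional-argument ambiguity from the question.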

Upvotes: 2
