Tom

Reputation: 75

Plot using seaborn with FacetGrid where values are ndarray in dataframe

I want to plot a dataframe in which the y values are stored as ndarrays within a column, e.g.:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(index=np.arange(0,4), columns=('sample','class','values'))
for iloc in [0,2]:
    df.loc[iloc] = {'sample':iloc, 
                    'class':'raw', 
                    'values':np.random.random(5)}
    df.loc[iloc+1] = {'sample':iloc,
                      'class':'predict',
                      'values':np.random.random(5)}

grid = sns.FacetGrid(df, col="class", row="sample")
grid.map(plt.plot, np.arange(0,5), "value")

This raises:

TypeError: unhashable type: 'numpy.ndarray'

Do I need to break out the ndarrays into separate rows? Is there a simple way to do this?

Thanks

Upvotes: 1

Views: 859

Answers (1)

ImportanceOfBeingErnest

Reputation: 339102

This is quite an unusual way of storing data in a dataframe. Two options (I'd recommend option B):

A. Custom mapping in seaborn

Indeed, seaborn does not support this format natively. You can, however, write your own function to plot to the grid.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(index=np.arange(0,4), columns=('sample','class','values'))
for iloc in [0,2]:
    df.loc[iloc] = {'sample':iloc, 
                    'class':'raw', 
                    'values':np.random.random(5)}
    df.loc[iloc+1] = {'sample':iloc,
                      'class':'predict',
                      'values':np.random.random(5)}

grid = sns.FacetGrid(df, col="class", row="sample")

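def plot(*args, **kwargs):
    # grid.map passes the "values" column of each facet as a Series;
    # each facet holds a single row here, so plot that row's ndarray.
    plt.plot(args[0].iloc[0], **kwargs)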

grid.map(plot, "values")

B. Unnesting

However, I would advise "unnesting" the dataframe first to get rid of the numpy arrays inside the cells.

The question "pandas: When cell contents are lists, create a row for each element in the list" shows a way to do that.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(index=np.arange(0,4), columns=('sample','class','values'))
for iloc in [0,2]:
    df.loc[iloc] = {'sample':iloc, 
                    'class':'raw', 
                    'values':np.random.random(5)}
    df.loc[iloc+1] = {'sample':iloc,
                      'class':'predict',
                      'values':np.random.random(5)}

# Expand each array into columns, then stack them into one row per element.
res = df.set_index(["sample", "class"])["values"].apply(pd.Series).stack().reset_index()
res.columns = ["sample", "class", "original_index", "values"]

Then use the FacetGrid in the usual way.

grid = sns.FacetGrid(res, col="class", row="sample")
grid.map(plt.plot, "original_index", "values")
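
As an alternative to the apply(pd.Series).stack() step, pandas 0.25 and later provide DataFrame.explode, which may be a simpler way to unnest. The following is only a minimal sketch under that version assumption; the dataframe is rebuilt with a dict constructor for brevity, and the column name original_index is kept from the code above.

```python
import numpy as np
import pandas as pd

# Same nested layout as in the question: one ndarray per cell.
df = pd.DataFrame({
    "sample": [0, 0, 2, 2],
    "class": ["raw", "predict", "raw", "predict"],
    "values": [np.random.random(5) for _ in range(4)],
})

# One row per array element; the original row label is repeated in the index.
res = df.explode("values")

# Position of each element within its original array (0..4 per source row).
res["original_index"] = res.groupby(level=0).cumcount()

# explode leaves the column with object dtype, so convert back to float.
res["values"] = res["values"].astype(float)
res = res.reset_index(drop=True)
```

The resulting res has the columns sample, class, values and original_index, so it should work with the same FacetGrid calls as above.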

Upvotes: 1
