Vamsi

Reputation: 5993

seaborn pairplot after converting an integer column to string

I am having trouble with seaborn.pairplot() in the code below.

I have a dataframe, and in one case I have to convert one of its columns to string. After the conversion, pairplot() no longer works properly. How can I fix this?

Here is the code:

import numpy as np 
from pandas import DataFrame
import seaborn as sns
%matplotlib inline

Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df_temp = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)

print(df_temp)

sns.pairplot(df_temp) # This works

# Convert one of the columns to the string datatype
df_temp['A'] = df_temp['A'].astype(str)
sns.pairplot(df_temp) # Gives error

Complete error log - Error log

Upvotes: 0

Views: 1674

Answers (2)

ImportanceOfBeingErnest

Reputation: 339300

The diagonal of a pairplot shows histograms, and it is not possible to draw histograms from strings. Since I'm not sure what you would want to show on the diagonal in that case, let's leave it out and simply plot a pair grid from the dataframe, which contains strings in one column:

import matplotlib.pyplot as plt
import numpy as np 
from pandas import DataFrame
import seaborn as sns


Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)
df['A'] = list("VWXYZ")

g = sns.PairGrid(df, vars=df.columns, height=2)
g.map_offdiag(sns.scatterplot)

plt.show()


If instead the aim is to just use numeric columns, you can filter the dataframe by dtype.

import matplotlib.pyplot as plt
import numpy as np 
from pandas import DataFrame
import seaborn as sns

Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)


# Convert one of the columns to the string datatype
df['A'] = df['A'].astype(str)
sns.pairplot(df.select_dtypes(include=[np.number])) 

plt.show()

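If the string column still holds numeric text, as `df['A'].astype(str)` produces here, another option might be to convert a copy back to numbers just for plotting. This is only a minimal sketch and not part of the approach above: `plot_df` is a name made up for illustration, and it assumes every value in the column parses as a number (with `errors='coerce'`, anything that does not parse becomes NaN).

```python
import numpy as np
import pandas as pd

index = ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
cols = ['A', 'B', 'C', 'D']
df = pd.DataFrame(np.abs(np.random.randn(5, 4)), index=index, columns=cols)

# The column is kept as strings in the main dataframe
df['A'] = df['A'].astype(str)

# plot_df is a hypothetical working copy used only for plotting
plot_df = df.copy()
plot_df['A'] = pd.to_numeric(plot_df['A'], errors='coerce')

print(plot_df.dtypes)
# With seaborn imported as sns, sns.pairplot(plot_df) can then use all four columns.
```

The original `df` keeps its string column, so any code that relies on the string values is unaffected.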

Upvotes: 2

CAPSLOCK

Reputation: 6483

import numpy as np 
from pandas import DataFrame
import seaborn as sns
%matplotlib inline

Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df_temp = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)

print(df_temp)

# Convert one of the columns to the string datatype
df_temp['A'] = df_temp['A'].astype(str)

You can find all the columns of type float and plot only those.

cols_to_plot = df_temp.dtypes == 'float'  # True for float columns, False for strings

sns.pairplot(df_temp[cols_to_plot[cols_to_plot].index])
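The column selection can be checked on its own, without plotting. Here is a minimal sketch, assuming the float columns are stored as `float64` (the NumPy default); `is_float` and `float_cols` are names chosen for illustration:

```python
import numpy as np
import pandas as pd

index = ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
cols = ['A', 'B', 'C', 'D']
df_temp = pd.DataFrame(np.abs(np.random.randn(5, 4)), index=index, columns=cols)

# Convert one column to strings, as in the question
df_temp['A'] = df_temp['A'].astype(str)

# Boolean mask over the columns: True where the dtype is float64
is_float = df_temp.dtypes == 'float64'

# Keep only the column names where the mask is True
float_cols = is_float[is_float].index.tolist()
print(float_cols)  # ['B', 'C', 'D']
```

Passing `df_temp[float_cols]` to `sns.pairplot` would then plot only the remaining float columns.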

Upvotes: 1
