Reputation: 5993
I am having trouble with seaborn.pairplot() in the code below.
I have a dataframe, and in one case I have to convert one of its columns to string. After the conversion, pairplot() no longer works. How can I fix this?
Here is the code:
import numpy as np
from pandas import DataFrame
import seaborn as sns
%matplotlib inline
Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df_temp = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)
print(df_temp)
sns.pairplot(df_temp) # This works
# convert one of the columns to string dtype
df_temp['A'] = df_temp['A'].astype(str)
sns.pairplot(df_temp) # Gives error
Complete error log - Error log
Upvotes: 0
Views: 1674
Reputation: 339300
On the diagonal of a pairplot there are histograms. It is not possible to draw histograms from strings. Since I'm not sure what you would want to show on the diagonal instead in such a case, let's leave that out and simply plot a pair grid from the dataframe which contains strings in one column:
import matplotlib.pyplot as plt
import numpy as np
from pandas import DataFrame
import seaborn as sns
Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)
df['A'] = list("VWXYZ")
g = sns.PairGrid(df, vars=df.columns, height=2)
g.map_offdiag(sns.scatterplot)
plt.show()
If instead the aim is to just use numeric columns, you can filter the dataframe by dtype.
import matplotlib.pyplot as plt
import numpy as np
from pandas import DataFrame
import seaborn as sns
Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)
# convert one of the columns to string dtype
df['A'] = df['A'].astype(str)
sns.pairplot(df.select_dtypes(include=[np.number]))
plt.show()
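If the string column still holds numbers (as in the question, where floats were converted with astype(str)), another option is to convert it back before plotting. This is only a minimal sketch: restore_numeric is a hypothetical helper name, not part of pandas or seaborn, and it assumes every string in the column parses as a number (errors='coerce' turns anything else into NaN).

```python
import numpy as np
import pandas as pd

index = ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
cols = ['A', 'B', 'C', 'D']
df = pd.DataFrame(np.abs(np.random.randn(5, 4)), index=index, columns=cols)

# Same situation as in the question: column A now holds strings
df['A'] = df['A'].astype(str)


def restore_numeric(frame, column):
    """Hypothetical helper: return a copy with `column` parsed back to numbers."""
    out = frame.copy()
    # Values that cannot be parsed become NaN instead of raising an error
    out[column] = pd.to_numeric(out[column], errors='coerce')
    return out


df_numeric = restore_numeric(df, 'A')
print(df_numeric.dtypes)

# df_numeric contains only numeric columns again, so it can be passed to
# sns.pairplot(df_numeric) as in the first, working call of the question.
```

The original df is left untouched, so the string version stays available wherever it is needed.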
Upvotes: 2
Reputation: 6483
import numpy as np
from pandas import DataFrame
import seaborn as sns
%matplotlib inline
Index= ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
Cols = ['A', 'B', 'C', 'D']
df_temp = DataFrame(abs(np.random.randn(5, 4)), index=Index, columns=Cols)
print(df_temp)
# convert one of the columns to string dtype
df_temp['A'] = df_temp['A'].astype(str)
You can find all the columns of type float and plot only those.
cols_to_plot = df_temp.dtypes == 'float'  # boolean mask over the columns: True for float columns, False for strings
sns.pairplot(df_temp[cols_to_plot[cols_to_plot].index])  # keep only the columns where the mask is True
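Comparing dtypes against 'float' skips integer columns as well as strings. If those should be plotted too, a check with pandas.api.types.is_numeric_dtype is one possible alternative. The small dataframe below is made up purely for illustration, and note that is_numeric_dtype also treats boolean columns as numeric.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Made-up example data: one string column, one integer column, one float column
df_temp = pd.DataFrame({
    'A': ['0.1', '0.2', '0.3'],
    'B': [1, 2, 3],
    'C': [0.5, 1.5, 2.5],
})

# Keep every column whose dtype pandas considers numeric (int, float, bool, ...)
numeric_cols = [c for c in df_temp.columns if is_numeric_dtype(df_temp[c])]
print(numeric_cols)  # ['B', 'C']

# The filtered frame could then be plotted with sns.pairplot(df_temp[numeric_cols])
```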
Upvotes: 1