Reputation: 425
I created a random dataFrame simulating the dataset tips from seaborn:
import numpy as np
import pandas as pd
time = ['day','night']
sex = ['female','male']
smoker = ['yes','no']
for t in range(0,len(time)):
for s in range(0,len(sex)):
for sm in range(0,len(smoker)):
randomarray = np.random.rand(10)*10
if t == 0 and s == 0 and sm == 0:
df = pd.DataFrame(index=np.arange(0,len(randomarray)),columns=["total_bill","time","sex","smoker"])
L = 0
for i in range(0,len(randomarray)):
df.loc[i] = [randomarray[i], time[t], sex[s], smoker[sm]]
L = L + 1
else:
for i in range(0,len(randomarray)):
df.loc[i+L] = [randomarray[i], time[t], sex[s], smoker[sm]]
L = L + 1
My dataFrame df has, for each column, the same type of class as the dataFrame tips from seaborn's dataset:
tips = sns.load_dataset("tips")
type(tips["total_bill"][0])
type(tips["time"][0])
numpy.float64
str
And so on for the other columns. Same as my dataFrame:
type(df["total_bill"][0])
type(tips["time"][0])
numpy.float64
str
However, when I try to use seaborn's violinplot or factorplot following the documentation:
g = sns.factorplot(x="sex", y="total_bill", hue="smoker", col="time", data=df, kind="violin", split=True, size=4, aspect=.7);
I have no problems if I use the dataFrame tips, but when I use my dataFrame I get:
AttributeError: 'float' object has no attribute 'shape'
I Imagine this is an issue with the way I pass the array into the dataFrame, but I couldn't find what is the problem since every issue I found on the internet with the same AttributeError says it's because it's not the same type of class, and as shown above my dataFrame has the same type of class as the one in seaborn's documentation.
Any suggestions?
Upvotes: 1
Views: 12471
Reputation: 1
I had a different issue in my code that produced the same error:
'str' object has no attribute 'get'
For me, I had in my seaborn syntax ...data='df'...
where df
is an object, however, and should not be in quotes. Once I removed the quotes, my program worked perfectly. I made the mistake, as someone else might, because the x= and y= parameters are in quotes (for the columns in the dataframe)
Upvotes: -1
Reputation: 461
I got the same problem and was trying to find a solution but did not see the answer I was looking for. So I guess provide an answer here may help people like me.
The problem here is that the type of df.total_bill is object instead of float.
So the solution is to change it to float befor pass the dataframe to seaborn:
df.total_bill = df.total_bill.astype(float)
Upvotes: 8
Reputation: 339340
This is a rather unusual way of creating a dataframe. The resulting dataframe also has some very strange properties, e.g. it has a length of 50 but the last index is 88. I'm not going into debugging these nested loops. Instead, I would propose to create the dataframe from some numpy array, e.g. like
import numpy as np
import pandas as pd
time = ['day','night']
sex = ['female','male']
smoker = ['yes','no']
data = np.repeat(np.stack(np.meshgrid(time, sex, smoker), -1).reshape(-1,3), 10, axis=0)
df = pd.DataFrame(data, columns=["time","sex","smoker"])
df["total_bill"] = np.random.rand(len(df))*10
Then also plotting works fine:
g = sns.factorplot(x="sex", y="total_bill", hue="smoker", col="time", data=df,
kind="violin", size=4, aspect=.7)
Upvotes: 1