Reputation: 1529
When I tried plotting a pandas DataFrame in seaborn, I got a DataError. I fixed the problem by recreating the DataFrame from a dictionary instead of using lists and a for loop. However, I still don't understand why I got the error in the first case. The two DataFrames look identical to me. Can somebody explain what happens here?
import pandas as pd
import seaborn as sns

# When I create two seemingly identical data frames...
x = [0, 1, 2]
y = [3, 5, 7]
line_df1 = pd.DataFrame(columns=['x', 'y'])
for i in range(3):
    # add one row at a time to the initially empty frame
    line_df1.loc[i] = [x[i], y[i]]
line_dict = {'x': [0, 1, 2], 'y': [3, 5, 7]}
line_df2 = pd.DataFrame(line_dict)
# ...they look identical when printed.
print(line_df1)
print(line_df2)
>>    x  y
>> 0  0  3
>> 1  1  5
>> 2  2  7
>>    x  y
>> 0  0  3
>> 1  1  5
>> 2  2  7
# This first one throws a DataError...
sns.lineplot(x='x', y='y', data=line_df1)
# ...but this one does not.
sns.lineplot(x='x', y='y', data=line_df2)
Upvotes: 2
Views: 213
Reputation: 862611
The problem is that the columns of the first DataFrame have dtype object rather than a numeric dtype, which you can verify with DataFrame.dtypes:
print(line_df1.dtypes)
x    object
y    object
dtype: object
print(line_df2.dtypes)
x    int64
y    int64
dtype: object
To make the first approach work, set the dtype when creating the empty DataFrame:
line_df1 = pd.DataFrame(columns=['x','y'], dtype=int)
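If the DataFrame has already been filled and ended up with object columns, converting it afterwards is another option. This is only a minimal sketch: it builds the object-dtype frame explicitly with dtype=object to stand in for the row-by-row result, and the names obj_df and fixed_df are made up for illustration.

```python
import pandas as pd

# Stand-in for a frame whose columns ended up as object dtype
# (built explicitly here so the example does not depend on the pandas version).
obj_df = pd.DataFrame({'x': [0, 1, 2], 'y': [3, 5, 7]}, dtype=object)
print(obj_df.dtypes)

# infer_objects() tries to find a better dtype for each object column;
# since every value is a Python int, both columns become integer columns.
fixed_df = obj_df.infer_objects()
print(fixed_df.dtypes)
```

If the target type is known in advance, obj_df.astype(int) should work as well.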
However, if performance matters, the second approach (building the DataFrame from a dictionary) is better, because updating an empty DataFrame is the least preferred way to build one:
6) updating an empty frame (e.g. using loc one-row-at-a-time)
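If the rows really do have to be produced in a loop, one common pattern is to collect them in a plain Python list and construct the DataFrame once at the end. A rough sketch, reusing the x and y lists from the question; the names rows and line_df3 are hypothetical:

```python
import pandas as pd

x = [0, 1, 2]
y = [3, 5, 7]

# Collect the rows as dictionaries instead of growing the DataFrame.
rows = []
for i in range(3):
    rows.append({'x': x[i], 'y': y[i]})

# Build the DataFrame in a single step; pandas infers integer dtypes here.
line_df3 = pd.DataFrame(rows, columns=['x', 'y'])
print(line_df3.dtypes)
```

Because the frame is created from complete data, the columns get a numeric dtype without any extra dtype argument.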
Upvotes: 2