Reputation: 148
While trying to just get basic scatter plot code to work, I ran into the much talked about:
Error:
ValueError: x and y must be the same size.
And yet, the answers I am finding on here do not seem to resolve this one. Here's the code. Can anyone spot what I am doing wrong?
The data:
import pandas as pd
iris = pd.read_csv('../week2/data/iris.csv')
iris.head()
produces output like this:
[image: output of iris.head()]
Scatter plot code:
%matplotlib inline
import matplotlib.pyplot as plt
PetalLength = iris['Petal.Length']
PetalWidth = iris['Petal.Width']
plt.rcParams['figure.figsize'] = 8, 6
plt.scatter(iris, PetalWidth, PetalLength)
plt.show()
I ran this code to look into what the error appeared to be saying but everything looks the same:
print(PetalWidth.shape, PetalLength.shape)
print(type(PetalWidth), type(PetalLength))
print(len(PetalWidth), len(PetalLength))
Above outputs this:
((150L,), (150L,))
(<type 'numpy.ndarray'>, <type 'numpy.ndarray'>)
(150, 150)
Final details in case this helps. I tried converting PetalWidth and PetalLength to lists based on a Stack Overflow post I found but that did not help either. Any guidance to help me get this code working would be appreciated.
Upvotes: 2
Views: 1086
Reputation: 1625
Comments on this post appear to contain the answer. While some plot types require a data set to be passed in, with x and y as fields in that data set, scatter takes just x and y arguments, where x and y are arrays of equal length that contain the data. The error is most likely being thrown because the entire data set is being treated as x in your example, and the second argument (the one you intended as x) is being treated as y. The size of the whole data set is then compared against the size of a single column, and that mismatch is what triggers the error.
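To make that size comparison concrete, here is a minimal sketch. It assumes a small made-up numeric table standing in for iris.csv (the six rows and two columns are placeholder values, not the real data), and it forces the non-interactive Agg backend only so that it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values standing in for iris.csv (not the real data set).
iris = pd.DataFrame({
    "Petal.Length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
    "Petal.Width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
})
petal_length = iris["Petal.Length"]
petal_width = iris["Petal.Width"]

# The whole table has rows * columns elements; a single column has only rows.
print(iris.size, petal_width.size)

fig, ax = plt.subplots()
error_message = None
try:
    # Same argument order as in the question: the table lands in the x slot,
    # petal_width in the y slot, and petal_length in the marker-size slot.
    ax.scatter(iris, petal_width, petal_length)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
plt.close(fig)
```

The two printed sizes differ because the table counts every cell, which is why two columns of identical length can still produce this error when the table itself is passed first.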
Delete the first argument (for the data set) and see if the problem goes away.
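A minimal sketch of the corrected call might look like the following. The small table is again made-up placeholder data in place of iris.csv, and the Agg backend line is only there so the snippet runs outside a notebook. The second call shows matplotlib's `data` keyword as an alternative for anyone who prefers to hand over the table and refer to columns by name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values standing in for iris.csv (not the real data set).
iris = pd.DataFrame({
    "Petal.Length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
    "Petal.Width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
})

fig, ax = plt.subplots(figsize=(8, 6))

# Pass only the two columns: x first, then y.
by_arrays = ax.scatter(iris["Petal.Width"], iris["Petal.Length"])

# Alternative: name the columns and supply the table through the data keyword.
by_names = ax.scatter("Petal.Width", "Petal.Length", data=iris)

ax.set_xlabel("Petal.Width")
ax.set_ylabel("Petal.Length")
```

In a notebook with `%matplotlib inline`, the figure is displayed at the end of the cell; elsewhere, call `plt.show()` (with parentheses) on an interactive backend.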
For others who stumble upon this post in the future: others on Stack Overflow have encountered this error when passing in x and y that were not arrays. There is even one post where someone fixed the problem by converting x and y to lists, but that is probably not the recommended solution. Finally, if the arrays x and y do not contain the same number of values (and hence do not have the same length), this error will definitely occur.
Upvotes: 1