user11435815

Reputation:

x and y must be the same size

Using Python, I'm trying to plot a sine wave and a random distribution, then show where the ratio of the two is greater than or equal to 3.

I think I'm 90% of the way there but keep getting the error message 'x and y must be the same size' when I try to plot it. I've been racking my brains but can't figure out what I'm missing.

Any help or pointers gratefully received.

import numpy as np
import math
import matplotlib.pyplot as plt

r= 2*math.pi
dev = 0.1
x = np.array(np.arange(0, r, dev))
y1 = np.array(np.sin(x))
y2 = np.array(np.random.normal(loc=0, scale=0.1, size=63))

mask = y1//y2 >= 3

fit = np.array(x[mask])

print(fit)


plt.plot(x, y1)
plt.scatter(x, fit)
plt.scatter(x, y2, marker=".")
plt.show()

Upvotes: 1

Views: 681

Answers (4)

Dhivya Bharkavi

Reputation: 25

"""## Splitting the dataset into the Training set and Test set"""

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 1/3, random_state = 0)

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

"""## Training the Simple Linear Regression model on the Training set"""

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

"""## Predicting the Test set results"""

y_pred = regressor.predict(X_test)

"""## Visualising the Training set results"""

plt.scatter(X_train, y_train, color = 'green')
plt.plot(X_train, regressor.predict(X_train), color = 'yellow')
plt.title('Doctor visits(Training set)')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()

"""## Visualising the Test set results"""

plt.scatter(X_test, y_test, color = 'red')
plt.plot(X_train, regressor.predict(X_train), color = 'blue')
plt.title('Doctor visits (Test set)')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()

Upvotes: 0

Jonas J.

Reputation: 26

In your line plt.scatter(x, fit) you are trying to scatter your x values against your fit values. However, fit only has 25 elements while x has 63 (as do y1 and y2, by the way, which is why that part works).

mask is an array of True and False values. When you index with it, as in np.array(x[mask]), you get an array containing only the values of x where mask is True, which seems to be what you want. But you can only scatter this against something of the same length, such as np.array(np.sin(fit)); otherwise the sizes are incompatible.
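A minimal sketch of how boolean indexing changes the array length. The condition `np.sin(x) >= 0.5` is only an illustrative stand-in (an assumption, chosen so the result does not depend on random numbers); it is not the ratio test from the question.

```python
import numpy as np

x = np.arange(0, 2 * np.pi, 0.1)

# Illustrative condition only (assumption); the question uses y1//y2 >= 3.
mask = np.sin(x) >= 0.5

# Boolean indexing keeps only the entries of x where mask is True.
selected = x[mask]

print(mask.dtype, mask.shape)        # the mask has one flag per element of x
print(selected.size, mask.sum())     # selected has as many entries as True flags
print(selected.size == x.size)       # shorter than x, so scatter(x, selected) fails
```

Anything plotted against `selected` has to be indexed with the same mask, or computed from `selected` itself, so that both arrays end up the same length.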

Upvotes: 0

Martin Gustafsson

Reputation: 303

Not sure if this is what you want, but this will scatter dots on the sine curve at the points selected by your mask.

import numpy as np
import math
import matplotlib.pyplot as plt

r= 2*math.pi
dev = 0.1
x = np.array(np.arange(0, r, dev))
y1 = np.array(np.sin(x))
y2 = np.array(np.random.normal(loc=0, scale=0.1, size=63))

mask = y1//y2 >= 3

fit_x = np.array(x[mask])
fit_y = np.array(y1[mask])


plt.plot(x, y1)
plt.scatter(fit_x, fit_y)
plt.scatter(x, y2, marker=".")
plt.show()

Upvotes: 0

Prune

Reputation: 77847

Insert this line into your code, just before the point of error:

print(len(x), len(fit))

Output:

63 28

You explicitly removed elements from your sequence, and then expected the two sequences to be the same size. You still have 63 x values, but now only 28 y values. Since you didn't trace the problem and explain what you intend for this scatter plot, I have no way of knowing what a "fix" might be. Perhaps make a list of points (x-y pairs), and then filter that for the appropriate y1/y2 ratio?
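One way the pair-filtering idea could look, as a rough sketch rather than a definitive fix. The helper name `filter_pairs`, the seeded generator, and the choice to pair x with y1 are all assumptions, since the question does not say which values the scatter should show.

```python
import numpy as np

def filter_pairs(x, y1, y2, ratio=3.0):
    """Return an (n, 2) array of (x, y1) pairs where y1 / y2 >= ratio.

    Hypothetical helper, not part of the original code.
    """
    pairs = np.column_stack((x, y1))
    # Suppress warnings in case y2 contains an exact zero.
    with np.errstate(divide="ignore", invalid="ignore"):
        keep = y1 / y2 >= ratio
    return pairs[keep]

rng = np.random.default_rng(0)   # seed is an assumption, for repeatable runs
x = np.arange(0, 2 * np.pi, 0.1)
y1 = np.sin(x)
y2 = rng.normal(loc=0.0, scale=0.1, size=x.size)

points = filter_pairs(x, y1, y2)
print(len(x), len(points))       # the pairs stay matched, whatever the count
```

Because each row keeps its x and y together, `plt.scatter(points[:, 0], points[:, 1])` always receives two columns of equal length.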

Upvotes: 1
