Reputation: 533
I would like to plot my linear regression model with the bike sales for each year summed into a single point, instead of the two separate points per year that I get now.
This is my code:
import matplotlib.pyplot as plt
from sklearn import linear_model
# Are bicycle purchases increasing or decreasing?
plot1 = df.groupby('Year')['Product_Category'].value_counts().rename('count').reset_index()
x = plot1['Year'].values.reshape(-1, 1)
y = plot1['count'].values.reshape(-1, 1)
# linear regression
regr = linear_model.LinearRegression()
regr.fit(x, y)
y_pred = regr.predict(x)  # was x_test, which is never defined
# plot
plt.scatter(x, y, color='black')
plt.plot(x, y, color='blue', linewidth=3)
This is my plot:
Upvotes: 0
Views: 36
Reputation: 672
As far as I can tell from your example, a possible solution is to replace value_counts with count.
Example data:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Year': [ 2019, 2019, 2020, 2021], 'Product_Category': ['a', 'b', 'c', 'd']})
print(df)
   Year Product_Category
0  2019                a
1  2019                b
2  2020                c
3  2021                d
The count will return:
plot1 = df.groupby('Year')['Product_Category'].count().rename('count').reset_index()
print(plot1)
   Year  count
0  2019      2
1  2020      1
2  2021      1
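To see why the original plot had more than one point per year, it may help to compare the two aggregations side by side on the same example data. This is only an illustrative sketch; the names `grouped`, `per_category`, and `per_year` are made up here and do not appear in the question.

```python
import pandas as pd

# Example data from the answer above.
df = pd.DataFrame({'Year': [2019, 2019, 2020, 2021],
                   'Product_Category': ['a', 'b', 'c', 'd']})

grouped = df.groupby('Year')['Product_Category']

# value_counts: one row per (Year, Product_Category) pair,
# so a year with several categories yields several points.
per_category = grouped.value_counts().rename('count').reset_index()

# count: one row per Year, i.e. a single point per year.
per_year = grouped.count().rename('count').reset_index()

print(per_category)
print(per_year)
```

With this data, `per_category` has two rows for 2019 (one for each category), while `per_year` has a single 2019 row with a count of 2.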
Plotting the aggregated data:
plot1 = df.groupby('Year')['Product_Category'].count().rename('count').reset_index()
# x, y: one point per year
x = plot1['Year'].values.reshape(-1, 1)
y = plot1['count'].values.reshape(-1, 1)
# plot
plt.scatter(x, y, color='black')
plt.plot(x, y, color='blue', linewidth=3)
plt.show()
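If the goal is to draw the fitted regression line rather than a line through the raw points, the per-year counts can be passed to `LinearRegression` from the question. The following is a minimal sketch on the example data, not part of the original answer; the function name `plot_trend` and the variable `model` are assumptions introduced for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Example data from the answer above.
df = pd.DataFrame({'Year': [2019, 2019, 2020, 2021],
                   'Product_Category': ['a', 'b', 'c', 'd']})

# One row per year, so each year becomes a single point.
plot1 = df.groupby('Year')['Product_Category'].count().rename('count').reset_index()

x = plot1['Year'].values.reshape(-1, 1)  # scikit-learn expects 2-D features
y = plot1['count'].values                # 1-D target

# Fit on the aggregated points and predict on the same years.
model = LinearRegression()
model.fit(x, y)
y_pred = model.predict(x)


def plot_trend(x, y, y_pred):
    """Scatter the yearly counts and draw the fitted line (hypothetical helper)."""
    plt.scatter(x, y, color='black')
    plt.plot(x, y_pred, color='blue', linewidth=3)
    plt.xlabel('Year')
    plt.ylabel('count')
    plt.show()


if __name__ == "__main__":
    plot_trend(x, y, y_pred)
```

The only differences from the code above are that the line is drawn from `y_pred` instead of `y`, and that the prediction uses `x` because no separate test set is defined.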
Upvotes: 1