Reputation: 181
I have a large dataset of 23k rows. The data looks something like the following:
import pandas as pd
d = {'Date': ["1-1-2020", "1-1-2020", "1-2-2020", "1-2-2020"],
     'Stock_id': [5, 41, 5, 41],
     'last_price': [230, 8, 241, 9],
     'price': [241, 9, 240, 8.5]}
df = pd.DataFrame(data=d)
Date Stock_id last_price price
0 1-1-2020 5 230 241.0
1 1-1-2020 41 8 9.0
2 1-2-2020 5 241 240.0
3 1-2-2020 41 9 8.5
Note that the data includes many stocks on many different dates. How can I create a model that uses features such as last_price and Stock_id to predict the next-day price, and that re-trains on the old data as new data comes in?
This was the best I could come up with. I used LinearRegression, but suggestions for any other model are welcome.
X = df[['Stock_id', 'last_price']]
y = df['price']
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn import linear_model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
lm = linear_model.LinearRegression()
lm.fit(X_train, y_train)
y_pred = lm.predict(X_test)
result = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
Index Actual Predicted
487 45 32
4154 420 512
Is there a way to train the model on the first 3000 rows, have it predict prices for, say, 12-11-2020, then add the 12-11-2020 data to the training set before predicting 12-12-2020, and so on?
I was hoping to get something like this.
Date Actual Predicted
12-11-2020 45 32
12-11-2020 420 512
12-12-2020 43 34
12-12-2020 423 513
Upvotes: 0
Views: 1067
Reputation: 11
I don't think having the ID in your training set is appropriate, since comparing IDs carries no usable information and may lead to a badly fitted linear function for your model. An ID just names a specific stock and stays constant for that stock across the whole dataset. The numeric value of Stock_id also has no meaning for comparing stocks: Stock_id = 1 and Stock_id = 2 are not "closer together" than Stock_id = 1 and Stock_id = 100; they are just labels. So I think you should split your original dataset by Stock_id and include only last_price in each of the new training datasets (X). You can do that in several ways, one of them being pandas' groupby function:
grouped = df.groupby(df.Stock_id)
stock_5 = grouped.get_group(5)  # all rows for Stock_id 5
After that, you can loop over the unique values of your Stock_id column to get each ID and its DataFrame. Then you define a regression model for each of these new datasets and train it with the fit method, along these lines:
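A minimal sketch of that loop, using the df from the question (the models dict and variable names are just illustrative):

from sklearn.linear_model import LinearRegression

# One model per stock, trained only on last_price, keyed by Stock_id.
models = {}
for stock_id, group in df.groupby("Stock_id"):
    model = LinearRegression()
    model.fit(group[["last_price"]], group["price"])
    models[stock_id] = model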
To retrain or update your regression model: LinearRegression does not support partial fitting, so I think you need to call the fit method again each time you want to update the model. You can fit on the first N rows of each stock, predict the next last_price, add the new value to those N rows, and re-fit the model on the extended dataset; a sketch follows below. However, if your model already fits the data well, I don't think you will see much of a difference from adding new points to the training dataset.
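A rough sketch of that expanding-window loop, assuming stock_df holds one stock's rows sorted by Date and that each day's actual row is appended before the next prediction:

from sklearn.linear_model import LinearRegression

def walk_forward(stock_df, n_train):
    # Fit on all rows seen so far, predict the next day, then grow the
    # training window by that day's row and repeat.
    model = LinearRegression()
    preds = []
    for i in range(n_train, len(stock_df)):
        train = stock_df.iloc[:i]
        model.fit(train[["last_price"]], train["price"])
        preds.append(model.predict(stock_df.iloc[[i]][["last_price"]])[0])
    return preds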
Another option is to use SGDRegressor instead of LinearRegression, since it has a partial_fit() method that allows incremental training: you can update your model on new data without re-training it on the whole dataset. You can find the documentation for this model here, and this answer explains the difference between SGDRegressor and LinearRegression.
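A minimal sketch of such an incremental update (the two-batch split is just for illustration; SGD is sensitive to feature scale, so the inputs are standardized first):

from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
sgd = SGDRegressor(random_state=0)
# Initial fit on the data available so far.
X_init = df[["last_price"]].iloc[:2]
sgd.partial_fit(scaler.fit_transform(X_init), df["price"].iloc[:2])
# Later, update on the new rows without re-training on the old ones.
X_new = df[["last_price"]].iloc[2:]
sgd.partial_fit(scaler.transform(X_new), df["price"].iloc[2:])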
If you still want to use LinearRegression and retrain the model, I suggest updating it with batches of data instead of retraining on each new predicted value. You can wait until your new values reach a certain count, for example 10, then add those 10 rows to your training dataset and retrain the model just once. This answer explains three approaches to retraining a model which might be useful for you.
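For example, a batched-retrain sketch along those lines (BATCH_SIZE and the add_row helper are illustrative, not from any library):

import pandas as pd
from sklearn.linear_model import LinearRegression

BATCH_SIZE = 10
model = LinearRegression()
history = df.copy()   # rows the model has already been trained on
pending = []          # new rows waiting for the next retrain
model.fit(history[["last_price"]], history["price"])

def add_row(row):
    # Queue one new observed row (a dict); retrain only once a full batch arrives.
    global history
    pending.append(row)
    if len(pending) >= BATCH_SIZE:
        history = pd.concat([history, pd.DataFrame(pending)], ignore_index=True)
        pending.clear()
        model.fit(history[["last_price"]], history["price"])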
Upvotes: 1