FutureDataScientist

Reputation: 55

How to use .predict() in a Linear Regression model?

I'm trying to predict what a 15-minute delay in flight departure does to the flight's arrival time. I have thousands of rows as well as several columns in a DF. Two of these columns are dep_delay and arr_delay for departure delay and arrival delay. I have built a simple LinearRegression model:

y = nyc['dep_delay'].values.reshape((-1, 1))

arr_dep_model = LinearRegression().fit(y, nyc['arr_delay'])

Now I'm trying to find the predicted arrival delay if the flight's departure was delayed by 15 minutes. How would I use the model above to predict what the arrival delay would be?

My first thought was to use a for loop / if statement, but then I came across .predict() and now I'm even more confused. Does .predict work like a boolean, where I would use "if departure delay is equal to 15, then arrival delay equals y"? Or is it something like:

arr_dep_model.predict(y)?

Upvotes: 2

Views: 4921

Answers (2)

Abhishek

Reputation: 1625

Hi, check the code below for an example:

import pandas as pd
import random 
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({'x1':random.choices(range(0, 100), k=10), 'x2':random.choices(range(0, 100), k=10)})

df['y'] = df['x2'] * .5

X_train = df[['x1','x2']][:-3].values #Training on top 7 rows
y_train = df['y'][:-3].values #Training on top 7 rows

X_test = df[['x1','x2']][-3:].values # Values on which prediction will happen - bottom 3 rows

regr = LinearRegression()
regr.fit(X_train, y_train)

regr.predict(X_test)

Notice that X_test, the data the prediction runs on, has the same number of columns as X_train: both have the two columns ['x1', 'x2']. Using .values converts each of them to a NumPy array. You can create your own data (a two-column DataFrame in this example) and pass it to predict(), because the third column, y, is the one being predicted.

The output is an array of three predicted values, one for each of the three test rows.
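As a minimal sketch of predicting on your own data, the snippet below trains on fixed values (an assumption made for reproducibility, where the example above uses random ones) and then predicts for a small hand-made DataFrame. The name new_data and its values are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Fixed training data (illustrative values, chosen only for reproducibility).
train = pd.DataFrame({'x1': [3, 14, 27, 41, 58, 66, 90],
                      'x2': [80, 12, 45, 7, 33, 96, 51]})
train['y'] = train['x2'] * 0.5

regr = LinearRegression()
regr.fit(train[['x1', 'x2']].values, train['y'].values)

# Hypothetical new rows: same two columns, in the same order as the training features.
new_data = pd.DataFrame({'x1': [10, 20], 'x2': [40, 60]})
predictions = regr.predict(new_data[['x1', 'x2']].values)

# One prediction per row; since y is exactly 0.5 * x2, these land very close to 20 and 30.
print(predictions)
```

The only requirement on new_data is that it has the same columns, in the same order, as the data passed to fit().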

Upvotes: 1

Cardstdani

Reputation: 5223

When working with LinearRegression models in sklearn, you perform inference with the predict() method. You also have to make sure the input you pass has the correct shape: a 2D array with one row per sample and the same number of columns (features) as the training data. You can learn more about the proper use of predict() in the official documentation.

arr_dep_model.predict(your_input)

This line of code returns an array with one predicted value for each input row. You can call it inside a for loop to go through a set of input values, or pass all the values at once as the rows of a single array. Which one you choose depends on the needs of your project and the data you are working with.
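For the question's case, a rough sketch might look like the following. The real nyc DataFrame is not available here, so the dep_delay and arr_delay arrays are synthetic stand-ins (an assumption, not real flight data); only the shape handling and the predict() call are the point.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for nyc['dep_delay'] and nyc['arr_delay'] (assumed data).
rng = np.random.default_rng(0)
dep_delay = rng.uniform(-10, 120, size=200)
arr_delay = 0.9 * dep_delay - 5 + rng.normal(0, 3, size=200)

# One feature, so the input is reshaped to a single column: shape (200, 1).
X = dep_delay.reshape(-1, 1)
arr_dep_model = LinearRegression().fit(X, arr_delay)

# predict() needs a 2D input as well: one row (one flight), one column (dep_delay).
predicted = arr_dep_model.predict(np.array([[15]]))
print(predicted[0])  # predicted arrival delay for a 15-minute departure delay

# Several departure delays can be predicted in one call, one row each.
many = arr_dep_model.predict(np.array([[0], [15], [30]]))

# For a single feature, the prediction is just slope * 15 + intercept.
by_hand = arr_dep_model.coef_[0] * 15 + arr_dep_model.intercept_
```

So predict() is not a boolean test: it evaluates the fitted line at whatever input values you give it, and no for loop or if statement is needed.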

Upvotes: 1
