How to apply an ML model to new rows in a dataset?

Question

Suppose I have a dataset and build an ML model on it. The dataset is updated weekly, and each time it is updated I want my model to predict the class for the new rows and append them to the original dataset. How can I do this?

This is what I tried:

import pandas as pd
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
df = pd.read_csv(url, names=names)
df

array = df.values
X = array[:,0:4]
y = array[:,4]
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1)

I'm skipping the steps where I compared the scores of different models.

model = SVC(gamma='auto')
model.fit(X_train, Y_train)
predictions = model.predict(X_validation)

Here I add some new rows to test with:

new_data = [[5.9, 3.0, 5.7, 1.5], [4.8, 2.9, 3.0, 1.2]]
df2 = pd.DataFrame(new_data,  columns =  ["sepal-length", "sepal-width", "petal-length", "petal-width"])

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df3 = pd.concat([df, df2], ignore_index=True)
df3

array2 = df3.values
X2 = array2[:,0:4]
predict = model.predict(X2)
predict

df3['pred'] = predict

def final_class(row):
    if pd.isnull(row['class']):
        return row['pred']
    else:
        return row['class']

df3['final_class'] = df3.apply(final_class, axis=1)
df3

It works, but I don't think this is the best way to do it. Can someone help?

Vit · Accepted Answer

That is the right way.

Alternatively, you can predict on the new rows only and append the predicted results to the existing dataset.
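
A minimal sketch of the second approach, predicting on the new rows only and then appending them. It makes a few assumptions that are not in the question: the iris data comes from sklearn's bundled `load_iris` rather than the CSV URL (so the class labels read `setosa` instead of `Iris-setosa`), the model is fit on all labelled rows, and the helper `append_predictions` and the `predicted` flag column are hypothetical names used only for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.svm import SVC

FEATURES = ["sepal-length", "sepal-width", "petal-length", "petal-width"]

# Assumption: use sklearn's bundled iris data so the sketch runs offline.
iris = load_iris()
df = pd.DataFrame(iris.data, columns=FEATURES)
df["class"] = iris.target_names[iris.target]

# Fit once on the labelled rows.
model = SVC(gamma="auto")
model.fit(df[FEATURES], df["class"])


def append_predictions(model, df, new_rows):
    """Predict the class for new_rows only and append them to df.

    Hypothetical helper: it adds a boolean 'predicted' column so rows
    labelled by the model can be told apart from the original labels.
    """
    new_rows = new_rows.copy()
    new_rows["class"] = model.predict(new_rows[FEATURES])
    new_rows["predicted"] = True

    df = df.copy()
    if "predicted" not in df.columns:
        df["predicted"] = False

    return pd.concat([df, new_rows], ignore_index=True)


# The two test rows from the question stand in for a weekly update.
new_data = pd.DataFrame(
    [[5.9, 3.0, 5.7, 1.5], [4.8, 2.9, 3.0, 1.2]], columns=FEATURES
)
updated = append_predictions(model, df, new_data)
print(updated.tail())
```

Compared with the code in the question, this never re-predicts the rows that already have a label, so there is no need for a separate `pred` column and a row-by-row merge into `final_class`. Each week you would call the helper again with only that week's new rows.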
