CYBERDEVILZ

Reputation: 325

Predicting the square root of a number using Machine Learning

I am trying to create a program in Python that uses machine learning to predict the square root of a number. Here is what I have done in my program:

  1. created a csv file with numbers and their squares
  2. extracted the data from csv into suitable variables (X stores squares, y stores numbers)
  3. scaled the data using sklearn's StandardScaler
  4. built the ANN with two hidden layers each of 6 units (no activation functions)
  5. compiled the ANN using SGD as the optimizer and mean squared error as the loss function
  6. trained the model. Loss was around 0.063
  7. tried predicting, but the result is something else.

My actual code:

import numpy as np
import tensorflow as tf
import pandas as pd

df = pd.read_csv('CSV/SQUARE-ROOT.csv')

X = df.iloc[:, 1].values
X = X.reshape(-1, 1)
y = df.iloc[:, 0].values
y = y.reshape(-1, 1)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_test_sc = sc.fit_transform(X_test)
X_train_sc = sc.fit_transform(X_train)
sc1 = StandardScaler()
y_test_sc1 = sc1.fit_transform(y_test)
y_train_sc1 = sc1.fit_transform(y_train)

ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=6))
ann.add(tf.keras.layers.Dense(units=6))
ann.add(tf.keras.layers.Dense(units=1))

ann.compile(optimizer='SGD', loss=tf.keras.losses.MeanSquaredError())

ann.fit(x = X_train_sc, y = y_train_sc1, batch_size=5, epochs = 100)

print(sc.inverse_transform(ann.predict(sc.fit_transform([[144]]))))

OUTPUT: array([[143.99747]], dtype=float32)

Shouldn't the output be 12? Why is it giving me the wrong result?

I am attaching the csv file I used to train my model as well: SQUARE-ROOT.csv

Upvotes: 2

Views: 1213

Answers (2)

wong.lok.yin

Reputation: 889

The reason your code does not work is that you apply fit_transform to your test set, which is wrong. You can fix it by replacing fit_transform(test) with transform(test). You also inverse-transform the prediction with the X scaler (sc) when it should be the y scaler (sc1), since the network outputs values in the scaled y space. Although I don't think StandardScaler is necessary here, please try this:

import numpy as np
import tensorflow as tf
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

N = 10000
X = np.arange(1, N).reshape(-1, 1)
y = np.sqrt(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)


sc = StandardScaler()
X_train_sc = sc.fit_transform(X_train)    
#X_test_sc = sc.fit_transform(X_test)      # wrong!!!
X_test_sc = sc.transform(X_test)

sc1 = StandardScaler()       
y_train_sc1 = sc1.fit_transform(y_train)    
#y_test_sc1 = sc1.fit_transform(y_test)   # wrong!!!
y_test_sc1 = sc1.transform(y_test)

ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))    # with 10000 samples, a slightly deeper network may help
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))
ann.add(tf.keras.layers.Dense(units=1))

ann.compile(optimizer='SGD', loss='MSE')
ann.fit(x=X_train_sc, y=y_train_sc1, batch_size=32, epochs=100, validation_data=(X_test_sc, y_test_sc1))

#print(sc.inverse_transform(ann.predict(sc.fit_transform([[144]]))))  # wrong!!!
print(sc1.inverse_transform(ann.predict(sc.transform([[144]]))))
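
As a quick sanity check after training (a sketch; the exact numbers depend on the trained weights), you can compare a few predictions against np.sqrt:

test_vals = np.array([[144.0], [625.0], [2500.0]])
preds = sc1.inverse_transform(ann.predict(sc.transform(test_vals)))
for v, p in zip(test_vals.ravel(), preds.ravel()):
    print(f"sqrt({v:.0f}) ~ {p:.3f} (true {np.sqrt(v):.3f})")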

Upvotes: 0

GPhilo

Reputation: 19153

TL;DR: You really need those nonlinearities.

In general, a model failing to learn can have one (or a combination) of several causes: a bad input data range, flaws in the data, over/underfitting, and so on.

However, in this specific case the model you built literally can't learn the function you're trying to approximate: without nonlinearities it is a purely linear model, and a linear model can't accurately approximate a nonlinear function like the square root.

A Dense layer is implemented as follows:

x_res = activ_func(w*x + b)

where x is the layer input, w the weights, b the bias vector and activ_func the activation function (if one is defined).

Your model, then, mathematically becomes (I'm using indices 1 to 3 for the three Dense layers):

pred = w3 * (w2 * ( w1 * x + b1 ) + b2 ) + b3
     = w3*w2*w1*x + w3*w2*b1 + w3*b2 + b3

As you see, the resulting model is still linear. Add activation functions and your model becomes capable of learning nonlinear functions too. From there, experiment with the hyperparameters and see how the performance of your model changes.
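
If you want to verify this collapse numerically, here is a minimal sketch: the shapes mirror the 1 -> 6 -> 6 -> 1 architecture from the question, with arbitrary random weights standing in for trained ones.

import numpy as np

rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(1, 6)), rng.normal(size=6)   # hidden layer 1
w2, b2 = rng.normal(size=(6, 6)), rng.normal(size=6)   # hidden layer 2
w3, b3 = rng.normal(size=(6, 1)), rng.normal(size=1)   # output layer

x = np.array([[144.0]])

# Layer-by-layer forward pass with no activation functions
h1 = x @ w1 + b1
h2 = h1 @ w2 + b2
pred = h2 @ w3 + b3

# The same computation collapsed into a single affine map w*x + b
w = w1 @ w2 @ w3
b = b1 @ w2 @ w3 + b2 @ w3 + b3
print(np.allclose(pred, x @ w + b))  # True: the stack is one linear model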

Upvotes: 1
