user10543278
user10543278

Reputation:

Tensorflow - How to read the prediction

I have a problem. For a school project, I created a Recurrent Neural Network (RNN) where I wanted to predict if the stock price will rise or fall. I also have some data from a CSV file. Training went fine, so I was ready to predict some tests. From the RNN I get some results because it has multiple predictions over a period of a week. Here is my code:

import io
import requests
import os
import time
import random

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

from sklearn.metrics import mean_absolute_error
from sklearn import preprocessing
from collections import deque

#Constant Variables
SEQ_LEN = 30
FUTURE_PERIOD_PREDICT = 3
RATIO_TO_PREDICT = "LTC-USD"
BATCH_SIZE = 64
NAME = str(RATIO_TO_PREDICT) + "-" + str(SEQ_LEN) + "-SEQ-" + str(FUTURE_PERIOD_PREDICT) + "-PRED-" + str(int(time.time()))
ACTIONS = ["Sell", "Buy"]

def classify(current, future):
    if float(future) > float(current):
        return 1
    else:
        return 0

def preprocess_df(df):
    df = df.drop('future', 1)

    for col in df.columns:
        if col != "target":
            df[col] = df[col].pct_change()
            df.dropna(inplace=True)
            #df[col] = preprocessing.scale(df[col].values)

    df.dropna(inplace=True)

    sequential_data = []
    prev_days = deque(maxlen=SEQ_LEN)



    for i in df.values:
        prev_days.append([n for n in i[:-1]])
        if len(prev_days) == SEQ_LEN:
            sequential_data.append([np.array(prev_days), i[-1]])

    buys = []
    sells = []

    for seq, target in sequential_data:
        if target == 0:
            sells.append([seq, target])
        elif target == 1:
            buys.append([seq, target])


    random.shuffle(buys)
    random.shuffle(sells)

    lower = min(len(buys), len(sells))


    buys = buys[:lower]
    sells = sells[:lower]


    sequential_data = buys+sells

    x = []
    y = []

    for seq, target in sequential_data:
        x.append(seq)
        y.append(target)

    return np.array(x), y




main_df = pd.DataFrame()

ratios = ["BTC-USD", "LTC-USD", "ETH-USD"]
for ratio in ratios:


    url="https://www.test.nl/get_csv_data_onscreen.php?method=test&ratio=" + str(ratio)
    dataset = requests.get(url, verify=False).content
    df = pd.read_csv(io.StringIO(dataset.decode('utf-8')), names=["time", "low", "high", "open", "close", "volume", "rsi14", "ma5", "ema5", "ema12", "ema20", "macd", "signal"])

    df.rename(columns={"close": str(ratio)+"_close", "volume": str(ratio) + "_volume", "rsi14": str(ratio) + "_rsi14", "ma5": str(ratio) + "_ma5", "ema5": str(ratio) + "_ema5", "ema12": str(ratio) + "_ema12", "ema20": str(ratio) + "_ema20", "macd": str(ratio) + "_macd", "signal": str(ratio) + "_signal"}, inplace=True)

    df.set_index("time", inplace=True)
    df = df[[str(ratio) + "_close", str(ratio) + "_volume", str(ratio) + "_rsi14", str(ratio) + "_ma5", str(ratio) + "_ema5", str(ratio) + "_ema12", str(ratio) + "_ema20", str(ratio) + "_macd", str(ratio) + "_signal"]]

    if len(main_df) == 0:
        main_df = df
    else:
        main_df = main_df.join(df)


main_df['future'] = main_df[str(RATIO_TO_PREDICT) + "_close"].shift(-FUTURE_PERIOD_PREDICT)
main_df['target'] = list(map(classify, main_df[str(RATIO_TO_PREDICT) + "_close"], main_df["future"]))
#print(main_df[[str(RATIO_TO_PREDICT) + "_close", "future", "target"]].head(10))


times = sorted(main_df.index.values)
last_5pct = times[-int(0.05*len(times))]

validation_main_df = main_df[(main_df.index >= last_5pct)]
main_df = main_df[(main_df.index < last_5pct)]

test_x, test_y = preprocess_df(main_df)
validation_x, validation_y = preprocess_df(validation_main_df)

model = tf.keras.models.load_model("models\Crypto_Model_0.6337.h5")

predictions = model.predict(test_x)
print(predictions)
print(ACTIONS[int(prediction[0][0])])

So when I print the predictions I get a list of numbers arround 0 and 1. Here is a short version of the result:

[[ 0.61009574]
 [ 0.5243717 ]
 [ 0.56290686]
 [ 0.49165   ]
 [ 0.50527   ]
 [ 0.77428705]
 [ 0.62151164]
 [ 0.55098933]
 [ 0.45642132]
 [ 0.61239064]
 [ 0.69220203]
 [ 0.3707057 ]
 [ 0.5335519 ]
 [ 0.43078205]
 [ 0.57520276]
 [ 0.46626005]
 [ 0.37625414]
 [ 0.56013215]]

But what is the latest datapoint. For example, here is a part of the list I upload:

1535782500,63.41,63.63,63.47,63.52,83505,55.104896,63.574000,63.586200,63.61220000,63.454000,0.31080000,0.44500684
1535783400,63.44,63.74,63.52,63.62,95980,56.921744,63.578000,63.597500,63.61340000,63.469800,0.28840000,0.41370000
1535784300,63.62,63.86,63.64,63.81,71996,60.216065,63.616000,63.668300,63.64360000,63.502200,0.28270000,0.38750000
1535785200,63.71,64.00,63.83,63.82,101652,60.387764,63.644000,63.718900,63.67070000,63.532500,0.27580000,0.36520000
1535786100,63.64,63.87,63.82,63.84,78686,60.752590,63.722000,63.759300,63.69670000,63.561800,0.26880000,0.34590000
1535787000,63.76,63.88,63.84,63.84,82486,60.752590,63.786000,63.786200,63.71870000,63.588300,0.26030000,0.32880000
1535787900,63.70,63.89,63.84,63.72,71654,57.093572,63.806000,63.764100,63.71890000,63.600800,0.24110000,0.31130000
1535788800,63.69,63.87,63.73,63.76,88931,58.001593,63.796000,63.762700,63.72520000,63.616000,0.22650000,0.29430000
1535789700,63.71,63.86,63.79,63.82,87103,59.389894,63.796000,63.781800,63.73980000,63.635400,0.21730000,0.27890000
1535790600,63.76,63.97,63.77,63.89,102919,61.009256,63.806000,63.817900,63.76290000,63.659600,0.21320000,0.26580000

I input 1 week of data of 15 minutes, So that is 672 rows. So just to be clear....

Is the last value of predictions the prediction of the last row from the csv file?

Upvotes: 1

Views: 282

Answers (1)

colbyham
colbyham

Reputation: 567

Why are you shuffling the sequential time data? The date/time index should be in each row and tell you what day it is predicting. Shuffling sequential data isn't heavily recommended for training RNNs or LSTMs. It also appears like you are trying to apply reinforcement learning which I would always recommend avoiding for training, you can get a few lucky actions and the model will just remember the data points, not generalize an algorithm.

Upvotes: 1

Related Questions