Wednesday, August 7, 2019

Predicting Future Stock Price using RNN

In this tutorial, I will show you how to predict future stock prices using a recurrent neural network (RNN) built from LSTM layers.

Soumil Nitin Shah

Bachelor's in Electronic Engineering | Master's in Electrical Engineering | Master's in Computer Engineering

Hello! I’m Soumil Nitin Shah, a software and hardware developer based in New York City. I have completed my bachelor's degree in Electronic Engineering and a double master’s in Computer and Electrical Engineering. I develop Python-based cross-platform desktop applications, web pages, software, REST APIs, databases, and much more. I have more than two years of experience in Python.

Import all Modules
In [21]:
import numpy as np
import os
import tensorflow as tf
import pandas as pd
import seaborn as sns 
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout

%matplotlib inline
Using TensorFlow backend.
In [4]:
df = pd.read_csv("Google_Stock_Price_Train.csv")
print(df.head(3))
print("\n")
print(df.isnull().sum())
print("\n")
print(df.info())
print("\n")
print(df.describe())
       Date    Open    High     Low   Close     Volume
0  1/3/2012  325.25  332.83  324.97  663.59  7,380,500
1  1/4/2012  331.27  333.87  329.08  666.45  5,749,400
2  1/5/2012  329.83  330.75  326.89  657.21  6,590,300


Date      0
Open      0
High      0
Low       0
Close     0
Volume    0
dtype: int64


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1258 entries, 0 to 1257
Data columns (total 6 columns):
Date      1258 non-null object
Open      1258 non-null float64
High      1258 non-null float64
Low       1258 non-null float64
Close     1258 non-null object
Volume    1258 non-null object
dtypes: float64(3), object(3)
memory usage: 59.0+ KB
None


              Open         High          Low
count  1258.000000  1258.000000  1258.000000
mean    533.709833   537.880223   529.007409
std     151.904442   153.008811   150.552807
min     279.120000   281.210000   277.220000
25%     404.115000   406.765000   401.765000
50%     537.470000   540.750000   532.990000
75%     654.922500   662.587500   644.800000
max     816.680000   816.680000   805.140000
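
The info() output above reports Close and Volume as object columns, which is why describe() only summarizes Open, High, and Low. A likely cause is the comma used as a thousands separator (visible in Volume, e.g. 7,380,500). Below is a minimal sketch of asking pandas to parse those separators. The inline sample is an assumption that stands in for the real file, which is presumed to quote any value containing a comma.

```python
import io
import pandas as pd

# Three rows copied from the head() output above. The real CSV is assumed
# to have the same layout, with comma-containing values quoted.
sample_csv = io.StringIO(
    'Date,Open,High,Low,Close,Volume\n'
    '1/3/2012,325.25,332.83,324.97,663.59,"7,380,500"\n'
    '1/4/2012,331.27,333.87,329.08,666.45,"5,749,400"\n'
    '1/5/2012,329.83,330.75,326.89,657.21,"6,590,300"\n'
)

# thousands="," tells the parser to strip the separator before converting
parsed = pd.read_csv(sample_csv, thousands=",")
print(parsed.dtypes)
```

This tutorial only uses the Open column, so the step is optional here, but it matters if you later want Close or Volume as model inputs.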
Select the training set (the Open column)
In [9]:
Training_Set = df.iloc[:,1:2]
In [12]:
print(Training_Set.head(2))
print("\n")
print(Training_Set.shape)
     Open
0  325.25
1  331.27


(1258, 1)
Convert into a NumPy array
In [13]:
Training_Set = Training_Set.values
Normalize the dataset
In [14]:
sc = MinMaxScaler(feature_range=(0, 1))
Train = sc.fit_transform(Training_Set)
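
For reference, MinMaxScaler with feature_range=(0, 1) rescales each column as (x - min) / (max - min), using the minimum and maximum seen during fit. Below is a minimal NumPy sketch of the same arithmetic and its inverse. The helper names scale and unscale are hypothetical, and the bounds are the Open minimum and maximum reported by describe() above.

```python
import numpy as np

# Bounds of the training "Open" column, taken from df.describe() above
OPEN_MIN, OPEN_MAX = 279.12, 816.68

def scale(x, lo=OPEN_MIN, hi=OPEN_MAX):
    """Map prices to [0, 1] relative to the training range."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def unscale(z, lo=OPEN_MIN, hi=OPEN_MAX):
    """Inverse of scale(): map scaled values back to prices."""
    return np.asarray(z, dtype=float) * (hi - lo) + lo

prices = np.array([325.25, 331.27, 329.83])
scaled = scale(prices)
restored = unscale(scaled)
print(scaled)
print(restored)
```

Note that a price outside the training range, such as a later test price above 816.68, maps to a value outside [0, 1].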

We will look back 60 time steps (the previous 60 rows) to predict the next value.

In [18]:
X_Train = []
Y_Train = []

# Start at index 60 so that every sample has 60 previous values
for i in range(60, Train.shape[0]):

    # Input: the 60 values before index i (indices i-60 to i-1)
    X_Train.append(Train[i-60:i,0])

    # Target: the value at index i, predicted from the previous 60 values
    Y_Train.append(Train[i,0])

# Convert into Numpy Array
X_Train = np.array(X_Train)
Y_Train = np.array(Y_Train)

print(X_Train.shape)
print(Y_Train.shape)
(1198, 60)
(1198,)
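
The windowing loop above can also be wrapped in a small reusable function. This is a sketch only: make_windows is a hypothetical helper name, and the demo array is a stand-in with the same length as the training set (1258 rows).

```python
import numpy as np

def make_windows(series, lookback=60):
    """Split a 1-D series into (inputs, targets) for one-step-ahead prediction.

    Row k of X holds series[k : k + lookback]; y[k] is the value right after it.
    """
    series = np.asarray(series, dtype=float)
    X = np.array([series[i - lookback:i] for i in range(lookback, len(series))])
    y = series[lookback:]
    return X, y

# Stand-in data: 0, 1, 2, ... makes it easy to see which indices land where
demo = np.arange(1258, dtype=float)
X_demo, y_demo = make_windows(demo, lookback=60)
print(X_demo.shape, y_demo.shape)   # (1198, 60) (1198,)
print(X_demo[0, 0], X_demo[0, -1], y_demo[0])
```

With 1258 rows and a lookback of 60 there are 1258 - 60 = 1198 samples, which matches the shapes printed above.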

Applying Reshaping Function

In [20]:
# Keras LSTM layers expect input of shape (samples, time steps, features)
# so we add a third dimension holding a single feature
X_Train = np.reshape(X_Train, newshape=(X_Train.shape[0], X_Train.shape[1], 1))

Model

In [22]:
regressor = Sequential()

# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_Train.shape[1], 1)))
regressor.add(Dropout(0.2))

# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
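
To get a feel for the size of this network, the trainable parameters can be counted by hand. The sketch below assumes the standard Keras LSTM layout (four gates, each with input weights, recurrent weights, and one bias vector). The helper names are hypothetical. Dropout layers add no parameters.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input plus one bias, for every output unit
    return units * input_dim + units

layer_params = [
    lstm_params(50, 1),    # first LSTM: 1 feature per time step
    lstm_params(50, 50),   # second LSTM: fed by 50 units
    lstm_params(50, 50),   # third LSTM
    lstm_params(50, 50),   # fourth LSTM
    dense_params(1, 50),   # output layer
]
total_params = sum(layer_params)
print(layer_params)   # [10400, 20200, 20200, 20200, 51]
print(total_params)   # 71051
```

You can compare this against regressor.summary(), which prints the per-layer counts Keras actually uses.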
In [23]:
regressor.fit(X_Train, Y_Train, epochs = 60, batch_size = 32)
Epoch 1/60
1198/1198 [==============================] - 13s 10ms/step - loss: 0.0411
Epoch 2/60
1198/1198 [==============================] - 9s 7ms/step - loss: 0.0062
Epoch 3/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0057
Epoch 4/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0057
Epoch 5/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0048
Epoch 6/60
1198/1198 [==============================] - 12s 10ms/step - loss: 0.0044
Epoch 7/60
1198/1198 [==============================] - 12s 10ms/step - loss: 0.0041
Epoch 8/60
1198/1198 [==============================] - 10s 9ms/step - loss: 0.0046
Epoch 9/60
1198/1198 [==============================] - 10s 9ms/step - loss: 0.0043
Epoch 10/60
1198/1198 [==============================] - 13s 11ms/step - loss: 0.0043
Epoch 11/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0039
Epoch 12/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0039
Epoch 13/60
1198/1198 [==============================] - 10s 9ms/step - loss: 0.0040
Epoch 14/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0037
Epoch 15/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0038
Epoch 16/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0037
Epoch 17/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0036
Epoch 18/60
1198/1198 [==============================] - 14s 12ms/step - loss: 0.0033
Epoch 19/60
1198/1198 [==============================] - 13s 11ms/step - loss: 0.0037
Epoch 20/60
1198/1198 [==============================] - 20s 17ms/step - loss: 0.0035
Epoch 21/60
1198/1198 [==============================] - 19s 16ms/step - loss: 0.0032
Epoch 22/60
1198/1198 [==============================] - 12s 10ms/step - loss: 0.0039
Epoch 23/60
1198/1198 [==============================] - 17s 14ms/step - loss: 0.0036
Epoch 24/60
1198/1198 [==============================] - 14s 11ms/step - loss: 0.0033
Epoch 25/60
1198/1198 [==============================] - 14s 12ms/step - loss: 0.0036
Epoch 26/60
1198/1198 [==============================] - 13s 11ms/step - loss: 0.0033
Epoch 27/60
1198/1198 [==============================] - 10s 9ms/step - loss: 0.0032
Epoch 28/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0029
Epoch 29/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0032
Epoch 30/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0031
Epoch 31/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0032
Epoch 32/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0029
Epoch 33/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0031
Epoch 34/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0026
Epoch 35/60
1198/1198 [==============================] - 9s 7ms/step - loss: 0.0028
Epoch 36/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0028
Epoch 37/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0026
Epoch 38/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0032
Epoch 39/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0027
Epoch 40/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0027
Epoch 41/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0031
Epoch 42/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0026
Epoch 43/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0024
Epoch 44/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0026
Epoch 45/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0025
Epoch 46/60
1198/1198 [==============================] - 10s 9ms/step - loss: 0.0023
Epoch 47/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0024
Epoch 48/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0025
Epoch 49/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0023
Epoch 50/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0024
Epoch 51/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0022
Epoch 52/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0021
Epoch 53/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0019
Epoch 54/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0024
Epoch 55/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0021
Epoch 56/60
1198/1198 [==============================] - 9s 8ms/step - loss: 0.0024
Epoch 57/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0022
Epoch 58/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0021
Epoch 59/60
1198/1198 [==============================] - 11s 9ms/step - loss: 0.0023
Epoch 60/60
1198/1198 [==============================] - 10s 8ms/step - loss: 0.0020
Out[23]:
<keras.callbacks.History at 0x1a380a9518>

Test

In [75]:
df1 = pd.read_csv('Google_Stock_Price_Test.csv')
df1.head(2)
Out[75]:
       Date    Open    High     Low   Close     Volume
0  1/3/2017  778.81  789.63  775.80  786.14  1,657,300
1  1/4/2017  788.36  791.34  783.16  786.90  1,073,000

Combine the Open columns of the train and test datasets into one Series

In [34]:
Df_Total = pd.concat((df["Open"], df1["Open"]), axis=0)
In [36]:
Df_Total.shape
Out[36]:
(1278,)
In [40]:
# Getting the predicted stock price of 2017
# len(Df_Total) -> 1278 rows in total
# len(df1)      -> 20 test rows
# Subtracting the two gives the index where the test data starts (1258)
# Each prediction needs the previous 60 values, so we step back 60 more rows
# The slice below therefore holds 80 values: these are our inputs
inputs = Df_Total[len(Df_Total) - len(df1) - 60:].values

# We need to Reshape
inputs = inputs.reshape(-1,1)

# Normalize the Dataset
inputs = sc.transform(inputs)

X_test = []
for i in range(60, inputs.shape[0]):
    X_test.append(inputs[i-60:i, 0])
    
# Convert into Numpy Array
X_test = np.array(X_test)

# Reshape before Passing to Network
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

# Pass to Model 
predicted_stock_price = regressor.predict(X_test)

# Do inverse Transformation to get Values 
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
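
The index arithmetic in the cell above is easy to get wrong, so here is a small sketch that checks it on stand-in data. The array total is a hypothetical placeholder for Df_Total.values and simply counts 0, 1, 2, ... so that each value equals its own row position.

```python
import numpy as np

n_train, n_test, lookback = 1258, 20, 60

# Stand-in for Df_Total.values: each value equals its row position
total = np.arange(n_train + n_test)

start = len(total) - n_test - lookback      # first row fed to the network
inputs_demo = total[start:]                 # last 60 train rows + 20 test rows

windows = np.array([inputs_demo[i - lookback:i]
                    for i in range(lookback, len(inputs_demo))])

print(start, len(inputs_demo), windows.shape)
print(windows[0][0], windows[0][-1])        # rows used for the first test day
print(windows[-1][0], windows[-1][-1])      # rows used for the last test day
```

The first window covers rows 1198 to 1257 (the last 60 training days), and the last window covers rows 1217 to 1276, which includes the actual prices of the first 19 test days. In other words, each prediction is one step ahead from real prices, not a 20-day forecast made in one go.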
In [63]:
# Plot the actual opening prices of the test set
df1["Open"].plot()
plt.title("Actual Google Stock Price")
plt.grid(True)
plt.ylabel("Price in $")
Out[63]:
[Plot: "Actual Google Stock Price", y-axis "Price in $"]
In [108]:
Test = pd.read_csv('Google_Stock_Price_Test.csv')

Prediction = pd.DataFrame(data={
    "Date":Test["Date"].to_list(),
    "Open":Test["Open"],
    "Network Predicted":[x[0] for x in predicted_stock_price ]
})
In [110]:
Prediction.plot()
Out[110]:
[Plot: "Open" and "Network Predicted" for the 20 test days]
In [111]:
Prediction
Out[111]:
         Date    Open  Network Predicted
 0   1/3/2017  778.81         793.648376
 1   1/4/2017  788.36         791.829956
 2   1/5/2017  786.08         790.396301
 3   1/6/2017  795.26         789.623230
 4   1/9/2017  806.40         790.016052
 5  1/10/2017  807.86         792.160889
 6  1/11/2017  805.00         795.901611
 7  1/12/2017  807.14         800.327271
 8  1/13/2017  807.48         804.602051
 9  1/17/2017  807.08         808.119019
10  1/18/2017  805.81         810.553406
11  1/19/2017  805.12         811.812073
12  1/20/2017  806.91         812.031860
13  1/23/2017  807.25         811.624084
14  1/24/2017  822.30         810.999146
15  1/25/2017  829.62         811.270142
16  1/26/2017  837.81         813.219055
17  1/27/2017  834.71         817.141113
18  1/30/2017  814.66         822.236023
19  1/31/2017  796.86         826.282776
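
A plot gives a visual impression, but a numeric error makes the comparison easier to track between runs. The sketch below is not part of the original notebook. It computes the mean absolute error (MAE) and root mean squared error (RMSE) from the 20 rows of the table above, with the values copied in by hand.

```python
import math

# "Open" column of the prediction table above
actual = [778.81, 788.36, 786.08, 795.26, 806.40, 807.86, 805.00, 807.14,
          807.48, 807.08, 805.81, 805.12, 806.91, 807.25, 822.30, 829.62,
          837.81, 834.71, 814.66, 796.86]

# "Network Predicted" column of the same table
predicted = [793.648376, 791.829956, 790.396301, 789.623230, 790.016052,
             792.160889, 795.901611, 800.327271, 804.602051, 808.119019,
             810.553406, 811.812073, 812.031860, 811.624084, 810.999146,
             811.270142, 813.219055, 817.141113, 822.236023, 826.282776]

errors = [p - a for p, a in zip(predicted, actual)]

# MAE: average size of the error, in dollars
mae = sum(abs(e) for e in errors) / len(errors)

# RMSE: like MAE, but large misses are penalized more heavily
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

Both numbers are in dollars. Your own values will differ from run to run, since the network weights are initialized randomly.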

1 comment:

  1. very nicely describe thank you. we need to download the data set from kaggle.
