How can I improve the classification accuracy of LSTM/GRU recurrent neural networks?

OJJ · Jul 10, 2017 · Viewed 7.8k times

Binary classification problem in TensorFlow:

I have gone through the online tutorials and am trying to apply them to a real-time problem using a gated recurrent unit (GRU). I have tried every approach I know to improve the classification:

1) Started adding stacked RNN (GRU) layers (sketched below)
2) Increased the hidden units per RNN layer
3) Added "sigmoid" and "ReLU" activation functions for the hidden layer
4) Normalized the input data
5) Changed the hyperparameters
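For reference, stacking GRU layers with the TF 1.x contrib API looks roughly like this (a sketch, not my exact code; num_layers is an assumed value and I tried a few different depths):

# Sketch: stacking several GRU layers with tf.contrib.rnn.MultiRNNCell (TF 1.x).
# num_layers is illustrative only.
num_layers = 2
cells = [tf.contrib.rnn.GRUCell(hidden_units) for _ in range(num_layers)]
stacked_cell = tf.contrib.rnn.MultiRNNCell(cells)
outputs, states = tf.contrib.rnn.static_rnn(stacked_cell, x_, dtype=tf.float64)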

Please find the link to the dataset: https://github.com/madhurilalitha/Myfirstproject/blob/master/ApplicationLayerTrainingData1.txt

If you go through the dataset, it has the labels "normal" and "other than normal". I have encoded "normal" as '1 0' and abnormal as '0 1'. I have also reshaped the dataset into 3D, with the shapes below:

Shape of new train X: (7995, 5, 40)
Shape of new train Y: (7995, 2)
Shape of new test X: (1994, 5, 40)
Shape of new test Y: (1994, 2)
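For context, the encoding and windowing can be done roughly like this (a sketch, not my exact preprocessing; raw_X, raw_labels, and the sliding-window step are assumptions):

import numpy as np

# Sketch: one-hot encode the labels ("normal" -> [1, 0], anything else -> [0, 1])
# and slice the feature matrix into windows of time_steps consecutive rows.
def make_windows(raw_X, raw_labels, time_steps=5):
    Y = np.array([[1, 0] if lab == "normal" else [0, 1] for lab in raw_labels])
    windows, window_labels = [], []
    for i in range(len(raw_X) - time_steps + 1):
        windows.append(raw_X[i:i + time_steps])      # (time_steps, 40) slice
        window_labels.append(Y[i + time_steps - 1])  # label of the window's last row
    return np.array(windows), np.array(window_labels)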

I am not sure where my logic is going wrong. Could someone help me find the fault in my code?

The classification accuracy on the test data is 52.3%, and it performs with roughly the same accuracy on the training data. Please find the code below:

#Hyper Parameters for the model
learning_rate = 0.001   
n_classes = 2    
display_step = 100    
input_features = train_X.shape[1] #No of selected features(columns)    
training_cycles = 1000    
time_steps = 5 # No of time-steps to backpropagate through
hidden_units = 50 #No of GRU units in a GRU Hidden Layer   

#Input Placeholders
with tf.name_scope('input'):
    x = tf.placeholder(tf.float64, shape=[None, time_steps, input_features], name="x-input")
    y = tf.placeholder(tf.float64, shape=[None, n_classes], name="y-input")
#Weights and Biases    
with tf.name_scope("weights"):
    W = tf.Variable(tf.random_normal([hidden_units, n_classes]), name="layer-weights")

with tf.name_scope("biases"):
    b = tf.Variable(tf.random_normal([n_classes]),name = "unit-biases")     


# Unstack to get a list of 'time_steps' tensors of shape (batch_size, input_features)
x_ = tf.unstack(x, time_steps, axis=1)

#Defining a single GRU cell
gru_cell = tf.contrib.rnn.GRUCell(hidden_units)    

#GRU Output
with tf.variable_scope('MyGRUCel36'):   
    gruoutputs, grustates = tf.contrib.rnn.static_rnn(gru_cell, x_, dtype=tf.float64)

#Linear Activation , using gru inner loop last output
output = tf.add(tf.matmul(gruoutputs[-1],tf.cast(W,tf.float64)),tf.cast(b,tf.float64))

#Defining the loss function
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits = output))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

#Training the Model
sess = tf.InteractiveSession()    
sess.run(tf.global_variables_initializer())    
for i in range (training_cycles):   
    _,c = sess.run([optimizer,cost], feed_dict = {x:newtrain_X, y:newtrain_Y})

    if (i) % display_step == 0:
        print ("Cost for the training cycle : ",i," : is : ",sess.run(cost, feed_dict ={x :newtrain_X,y:newtrain_Y}))
correct = tf.equal(tf.argmax(output, 1), tf.argmax(y,1))    
accuracy = tf.reduce_mean(tf.cast(correct, 'float'))    
print('Accuracy on the overall test set is :',accuracy.eval({x:newtest_X, y:newtest_Y}))    

Answer

finbarr · Jul 10, 2017

It sounds like you're on the right track. I would try visualizing your training loss to make sure it's decreasing as you expect.
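For example, recording the cost every display step and plotting it afterwards is enough to see whether the loss is still going down (a minimal sketch, assuming matplotlib is available; it reuses the training loop from the question):

import matplotlib.pyplot as plt

# Sketch: collect the cost during training, then plot it.
costs = []
for i in range(training_cycles):
    _, c = sess.run([optimizer, cost], feed_dict={x: newtrain_X, y: newtrain_Y})
    if i % display_step == 0:
        costs.append(c)

plt.plot([i * display_step for i in range(len(costs))], costs)
plt.xlabel("training cycle")
plt.ylabel("cost")
plt.show()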

Is there a reason you think you should be getting higher accuracy? That could just be the best you can do with this amount of data. One of the best ways to increase model performance is to get more data; is it possible to collect more?

Hyperparameter optimization is a good way to proceed; I would try playing with different learning rates, different numbers of hidden layers, and different sizes of hidden layers.
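A simple way to do that is a small grid search, rebuilding and retraining the model for each combination (a sketch; build_and_train is a hypothetical helper that wraps the graph construction and training loop from the question and returns accuracy on a validation split):

import itertools

# Sketch: grid search over learning rate and hidden-layer size.
# build_and_train(lr, units) is hypothetical: it should build the graph with
# those hyperparameters, train it, and return validation accuracy.
best_params, best_acc = None, 0.0
for lr, units in itertools.product([1e-2, 1e-3, 1e-4], [25, 50, 100]):
    acc = build_and_train(lr, units)
    if acc > best_acc:
        best_params, best_acc = (lr, units), acc
print("Best hyperparameters:", best_params, "validation accuracy:", best_acc)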