How to create a confusion matrix for classification in TensorFlow

Raady · Mar 27, 2017 · Viewed 30.1k times

I have a CNN model with 4 output nodes, and I am trying to compute the confusion matrix so that I can see the accuracy of each individual class. I am already able to compute the overall accuracy. In the link here, Igor Valantic gave a function which computes the confusion matrix variables. It gives me an error at correct_prediction = tf.nn.in_top_k(logits, labels, 1, name="correct_answers"), and the error is TypeError: DataType float32 for attr 'T' not in list of allowed values: int32, int64

I tried casting logits to int32 inside the mentioned function, def evaluation(logits, labels), but that gives another error when computing correct_prediction = ...: TypeError: Input 'predictions' of 'InTopK' Op has type int32 that does not match expected type of float32

How can I calculate this confusion matrix?

sess = tf.Session()
model = dimensions() # CNN input weights are calculated 
data_train, data_test, label_train, label_test =  load_data(files_test2,folder)
data_train, data_test, = reshapedata(data_train, data_test, model)
# input output placeholders
x  = tf.placeholder(tf.float32, [model.BATCH_SIZE, model.input_width,model.input_height,model.input_depth]) # last column = 1 
y_ = tf.placeholder(tf.float32, [model.BATCH_SIZE, model.No_Classes])
p_keep_conv = tf.placeholder("float")
# 
y  = mycnn(x,model, p_keep_conv)
# loss
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_))
# train step
train_step = tf.train.AdamOptimizer(1e-3).minimize(cost)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
true_positives, false_positives, true_negatives, false_negatives = evaluation(y,y_)
lossfun = np.zeros(STEPS)
sess.run(tf.global_variables_initializer())

for i in range(STEPS):
    image_batch, label_batch = batchdata(data_train, label_train, model.BATCH_SIZE)
    epoch_loss = 0
    for j in range(model.BATCH_SIZE):
        sess.run(train_step, feed_dict={x: image_batch, y_: label_batch, p_keep_conv:1.0})
        c = sess.run( cost, feed_dict={x: image_batch, y_: label_batch, p_keep_conv: 1.0})
        epoch_loss += c
    lossfun[i] = epoch_loss
    print('Epoch',i,'completed out of',STEPS,'loss:',epoch_loss )
TP, FP, TN, FN = sess.run([true_positives, false_positives, true_negatives, false_negatives], feed_dict={x: image_batch, y_: label_batch, p_keep_conv: 1.0})

This is my code snippet.

Answer

vega · Mar 27, 2017

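You can simply use TensorFlow's confusion matrix, tf.confusion_matrix. I assume y holds your predictions; num_classes is optional, so you may or may not pass it. Note that tf.confusion_matrix expects 1-D vectors of class indices, so with one-hot labels and per-class scores as in the question, pass tf.argmax(y_, 1) and tf.argmax(y, 1) instead of y_ and y.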

y_ = placeholder_for_labels # true class indices, e.g. [1, 2, 4]
y = mycnn(...) # predicted class indices, e.g. [2, 2, 4]

confusion = tf.confusion_matrix(labels=y_, predictions=y, num_classes=num_classes)

If you print(confusion), you get

  [[0 0 0 0 0]
   [0 0 1 0 0]
   [0 0 1 0 0]
   [0 0 0 0 0]
   [0 0 0 0 1]]
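In this matrix the rows are the true labels and the columns are the predicted labels. Since the question asks for per-class figures, here is a minimal NumPy sketch of how the true/false positive and negative counts could be read off such a matrix. The helper names per_class_counts and per_class_recall are made up for illustration, and the sketch assumes "individual class accuracy" means the fraction of each true class that was predicted correctly (recall).

```python
import numpy as np

# The 5x5 matrix from the example above: rows = true labels, columns = predictions.
confusion = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
])

def per_class_counts(cm):
    """Return per-class TP, FP, FN, TN arrays from a square confusion matrix."""
    cm = np.asarray(cm)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp      # predicted as class k, but the true label differs
    fn = cm.sum(axis=1) - tp      # true class k, but predicted as something else
    tn = cm.sum() - tp - fp - fn  # every remaining sample
    return tp, fp, fn, tn

def per_class_recall(cm):
    """Fraction of each true class predicted correctly; NaN where a class has no samples."""
    cm = np.asarray(cm, dtype=float)
    support = cm.sum(axis=1)
    recall = np.full(cm.shape[0], np.nan)
    np.divide(np.diag(cm), support, out=recall, where=support > 0)
    return recall

tp, fp, fn, tn = per_class_counts(confusion)
recall = per_class_recall(confusion)
print(tp, fp, fn, tn)
print(recall)
```

Classes that never occur as a true label (0 and 3 here) get NaN rather than 0, so they are not mistaken for classes the model always gets wrong.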

If print(confusion) shows the tensor object rather than the matrix values, use print(confusion.eval(session=sess)) instead, where sess is your TensorFlow session.
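In the question's setup, y_ is a one-hot float tensor and y holds one score per class, whereas the example above uses plain class indices. The conversion is an argmax over the class axis on both sides. The following is a rough NumPy sketch of that conversion and of the counting that a confusion matrix performs. It is not TensorFlow's implementation: the function name confusion_from_scores and the sample batch are invented for illustration.

```python
import numpy as np

def confusion_from_scores(one_hot_labels, scores, num_classes):
    """Count (true class, predicted class) pairs; rows = true, columns = predicted."""
    true_idx = np.argmax(one_hot_labels, axis=1)  # one-hot rows -> class indices
    pred_idx = np.argmax(scores, axis=1)          # highest-scoring class per sample
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (true_idx, pred_idx), 1)        # unbuffered add, so repeated pairs accumulate
    return cm

# Hypothetical batch: 3 samples, 4 classes (the question's model has 4 output nodes).
labels = np.array([[0, 1, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]], dtype=np.float32)
scores = np.array([[0.1, 0.2, 0.6, 0.1],   # true class 1, predicted class 2
                   [0.0, 0.1, 0.8, 0.1],   # true class 2, predicted class 2
                   [0.2, 0.1, 0.1, 0.6]],  # true class 3, predicted class 3
                  dtype=np.float32)

cm = confusion_from_scores(labels, scores, num_classes=4)
print(cm)
```

Inside the TensorFlow graph, the corresponding step would be to pass tf.argmax(y_, 1) and tf.argmax(y, 1), which the question already uses for its accuracy computation, as the labels and predictions arguments.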