How to change the temperature of a softmax output in Keras

chasep255 · May 16, 2016

I am currently trying to reproduce the results of the following article:
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
I am using Keras with the Theano backend. In the article he talks about controlling the temperature of the final softmax layer to produce different outputs.

Temperature. We can also play with the temperature of the Softmax during sampling. Decreasing the temperature from 1 to some lower number (e.g. 0.5) makes the RNN more confident, but also more conservative in its samples. Conversely, higher temperatures will give more diversity but at cost of more mistakes (e.g. spelling mistakes, etc). In particular, setting temperature very near zero will give the most likely thing that Paul Graham might say:
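To make that effect concrete, here is a minimal numpy sketch (the logits below are made up purely for illustration, not taken from the article) showing how dividing the logits by a temperature before the softmax sharpens or flattens the resulting distribution:

import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    # Divide the logits by the temperature, then apply a numerically
    # stable softmax. T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.0, 0.1]                        # made-up example logits
print(softmax_with_temperature(logits, 1.0))    # approx. [0.66 0.24 0.10]
print(softmax_with_temperature(logits, 0.5))    # more peaked: "more confident"
print(softmax_with_temperature(logits, 2.0))    # flatter: "more diverse"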

My model is as follows.

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.optimizers import Adam

batch_size = 64  # example value; must match the batch size used when feeding data

model = Sequential()
model.add(LSTM(128, batch_input_shape=(batch_size, 1, 256), stateful=True, return_sequences=True))
model.add(LSTM(128, stateful=True))
model.add(Dropout(0.1))
model.add(Dense(256, activation='softmax'))

model.compile(optimizer=Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

The only way I can think of to adjust the temperature of the final Dense layer would be to get the weight matrix and multiply it by the temperature. Does anyone know of a better way to do it? Also, if anyone sees anything wrong with how I set up the model, please let me know, since I am new to RNNs.

Answer

chasep255 · May 16, 2016

Well, it looks like the temperature is something you apply to the output of the softmax layer when sampling. I found this example:

https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py

He applies the following function to sample from the softmax output.

import numpy as np

def sample(a, temperature=1.0):
    # Helper function to sample an index from a probability array.
    # Dividing the log-probabilities by the temperature and re-normalizing
    # reshapes the distribution before drawing from it.
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    return np.argmax(np.random.multinomial(1, a, 1))
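As a rough sketch of how that helper could be used with the model above (the names seed_input and next_index here are hypothetical, and seed_input is assumed to be a one-hot encoded batch of shape (batch_size, 1, 256)):

# Get the softmax distribution for the next character from the stateful model,
# then draw from it at a chosen temperature.
probs = model.predict(seed_input, batch_size=batch_size)[0]
next_index = sample(probs, temperature=0.5)   # lower T: safer; higher T: more varied

Note that taking the log of the softmax output, dividing by the temperature, and re-normalizing with exp is equivalent to applying a temperature-scaled softmax to the original logits, so nothing in the model itself needs to change.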