PyTorch softmax with dim

blue-sky picture blue-sky · Sep 26, 2018 · Viewed 12.4k times · Source

Which dimension should softmax be applied to ?

This code :

%reset -f

import torch.nn as nn
import numpy as np
import torch

my_softmax = nn.Softmax(dim=-1)

mu, sigma = 0, 0.1 # mean and standard deviation

train_dataset = []
image = []
image_x = np.random.normal(mu, sigma, 24).reshape((3 , 4, 2))
train_dataset.append(image_x)

x = torch.tensor(train_dataset).float()

print(x)
print(my_softmax(x))
my_softmax = nn.Softmax(dim=1)
print(my_softmax(x))

prints following :

tensor([[[[-0.1500,  0.0243],
          [ 0.0226,  0.0772],
          [-0.0180, -0.0278],
          [ 0.0782, -0.0853]],

         [[-0.0134, -0.1139],
          [ 0.0385, -0.1367],
          [-0.0447,  0.1493],
          [-0.0633, -0.2964]],

         [[ 0.0123,  0.0061],
          [ 0.1086, -0.0049],
          [-0.0918, -0.1308],
          [-0.0100,  0.1730]]]])
tensor([[[[ 0.4565,  0.5435],
          [ 0.4864,  0.5136],
          [ 0.5025,  0.4975],
          [ 0.5408,  0.4592]],

         [[ 0.5251,  0.4749],
          [ 0.5437,  0.4563],
          [ 0.4517,  0.5483],
          [ 0.5580,  0.4420]],

         [[ 0.5016,  0.4984],
          [ 0.5284,  0.4716],
          [ 0.5098,  0.4902],
          [ 0.4544,  0.5456]]]])
tensor([[[[ 0.3010,  0.3505],
          [ 0.3220,  0.3665],
          [ 0.3445,  0.3230],
          [ 0.3592,  0.3221]],

         [[ 0.3450,  0.3053],
          [ 0.3271,  0.2959],
          [ 0.3355,  0.3856],
          [ 0.3118,  0.2608]],

         [[ 0.3540,  0.3442],
          [ 0.3509,  0.3376],
          [ 0.3200,  0.2914],
          [ 0.3289,  0.4171]]]])

So first tensor is prior to softmax being applied, second tensor is result of softmax applied to tensor with dim=-1 and third tensor is result of softmax applied to tensor with dim=1 .

For result of first softmax can see corresponding elements sum to 1, for example [ 0.4565, 0.5435] -> 0.4565 + 0.5435 == 1.

What is summing to 1 as result of of second softmax ?

Which dim value should I choose ?

Update : The dimension (3 , 4, 2) corresponds to image dimension where 3 is the RGB value , 4 is the number of horizontal pixels (width) , 2 is the number of vertical pixels (height). This is an image classification problem. I'm using cross entropy loss function. Also, I'm using softmax in final layer in order to back-propagate probabilities.

Answer

unlut picture unlut · Sep 26, 2018

You have a 1x3x4x2 tensor train_dataset. Your softmax function's dim parameter determines across which dimension to perform Softmax operation. First dimension is your batch dimension, second is depth, third is rows and last one is columns. Please look at picture below (sorry for horrible drawing) to understand how softmax is performed when you specify dim as 1. enter image description here

In short, sum of each corresponding entry of your 4x2 matrices are equal to 1.

Update: The question which dimension the softmax should be applied depends on what data your tensor store, and what is your goal.

Update: For image classification task, please see the tutorial on official pytorch website. It covers basics of image classification with pytorch on a real dataset and its a very short tutorial. Although that tutorial does not perform Softmax operation, what you need to do is just use torch.nn.functional.log_softmax on output of last fully connected layer. See MNIST classifier with pytorch for a complete example. It does not matter whether your image is RGB or grayscale after flattening it for fully connected layers (also keep in mind that same code for MNIST example might not work for you, depends on which pytorch version you use).