Torch: How to shuffle a tensor by its rows?

DaveTheAl · Jun 24, 2017 · Viewed 12.4k times

I am currently working in Torch to implement a random shuffle of some input data along its rows (the first dimension in this case). I am new to Torch, so I am having trouble figuring out how permutation works.

The following is supposed to shuffle the data:

if argshuffle then 
    local perm = torch.randperm(sids:size(1)):long()
    print("\n\n\nSize of X and y before")
    print(X:view(-1, 1000, 128):size())
    print(y:size())
    print(sids:size())
    print("\nPerm size is: ")
    print(perm:size())
    X = X:view(-1, 1000, 128)[{{perm},{},{}}]
    y = y[{{perm},{}}]
    print(sids[{{1}, {}}])
    sids = sids[{{perm},{}}]
    print(sids[{{1}, {}}])
    print(X:size())
    print(y:size())
    print(sids:size())
    os.exit(69)
end

This prints out

Size of X and y before 
99 
1000
128
[torch.LongStorage of size 3]

99 
1
[torch.LongStorage of size 2]

99 
1
[torch.LongStorage of size 2]

Perm size is: 
99 
[torch.LongStorage of size 1]
5
[torch.LongStorage of size 1x1]
5
[torch.LongStorage of size 1x1]


99 
1000
128
[torch.LongStorage of size 3]

99 
1
[torch.LongStorage of size 2]

99 
1
[torch.LongStorage of size 2]

From the printed values (the first row of sids is 5 both before and after), I infer that the code did not shuffle the data. How can I make it shuffle correctly, and what is the common solution in Lua/Torch?

Answer

11t · Nov 13, 2018

I faced a similar issue. The documentation lists no shuffle function for tensors (there is one for dataset loaders). A workaround is to index the tensor with a random permutation from torch.randperm. The session below uses PyTorch:

>>> import torch
>>> a = torch.rand(3, 5)
>>> print(a)
tensor([[0.4896, 0.3708, 0.2183, 0.8157, 0.7861],
        [0.0845, 0.7596, 0.5231, 0.4861, 0.9237],
        [0.4496, 0.5980, 0.7473, 0.2005, 0.8990]])
>>> # Row shuffling: index dim 0 with a random permutation
>>> a = a[torch.randperm(a.size(0))]
>>> print(a)
tensor([[0.4496, 0.5980, 0.7473, 0.2005, 0.8990],
        [0.0845, 0.7596, 0.5231, 0.4861, 0.9237],
        [0.4896, 0.3708, 0.2183, 0.8157, 0.7861]])
>>> # Column shuffling: index dim 1 with a random permutation
>>> a = a[:, torch.randperm(a.size(1))]
>>> print(a)
tensor([[0.2005, 0.7473, 0.5980, 0.8990, 0.4496],
        [0.4861, 0.5231, 0.7596, 0.9237, 0.0845],
        [0.8157, 0.2183, 0.3708, 0.7861, 0.4896]])
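
The question shuffles three tensors (X, y, and sids) that must stay aligned row by row, so one permutation has to be reused for all of them instead of drawing a new one per tensor. Here is a minimal sketch of that, assuming PyTorch as in the session above. `shuffle_rows_together` is a hypothetical helper name, not a library function, and the tensor contents and the smaller trailing dimensions of X are made up for illustration.

```python
import torch

def shuffle_rows_together(*tensors):
    """Shuffle several tensors along dim 0 using one shared permutation.

    Hypothetical helper for illustration; returns the shuffled tensors
    and the permutation that was applied.
    """
    n = tensors[0].size(0)
    if any(t.size(0) != n for t in tensors):
        raise ValueError("all tensors must have the same first dimension")
    # Random permutation of 0..n-1, created on the same device as the data
    perm = torch.randperm(n, device=tensors[0].device)
    # index_select(0, perm) reorders rows; t[perm] gives the same result
    shuffled = tuple(t.index_select(0, perm) for t in tensors)
    return shuffled, perm

# 99 samples as in the question; values and trailing dims are assumptions
X = torch.rand(99, 10, 4)
y = torch.arange(99).view(99, 1)
sids = torch.arange(99).view(99, 1)

(X_s, y_s, sids_s), perm = shuffle_rows_together(X, y, sids)
print(X_s.size(), y_s.size(), sids_s.size())
```

Because the same `perm` is applied everywhere, row i of `X_s` still corresponds to row i of `y_s` and `sids_s`. In Lua Torch, the analogous call is, to my knowledge, `X:index(1, perm)` with a LongTensor `perm`, rather than the `{{perm},{}}` range syntax used in the question.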

I hope it answers the question!