Numpy shuffle multidimensional array by row only, keep column order unchanged

robert picture robert · Feb 26, 2016 · Viewed 88.3k times · Source

How can I shuffle a multidimensional array by row only in Python (so do not shuffle the columns).

I am looking for the most efficient solution, because my matrix is very huge. Is it also possible to do this highly efficient on the original array (to save memory)?

Example:

import numpy as np
X = np.random.random((6, 2))
print(X)
Y = ???shuffle by row only not colls???
print(Y)

What I expect now is original matrix:

[[ 0.48252164  0.12013048]
 [ 0.77254355  0.74382174]
 [ 0.45174186  0.8782033 ]
 [ 0.75623083  0.71763107]
 [ 0.26809253  0.75144034]
 [ 0.23442518  0.39031414]]

Output shuffle the rows not cols e.g.:

[[ 0.45174186  0.8782033 ]
 [ 0.48252164  0.12013048]
 [ 0.77254355  0.74382174]
 [ 0.75623083  0.71763107]
 [ 0.23442518  0.39031414]
 [ 0.26809253  0.75144034]]

Answer

Kasravnd picture Kasravnd · Feb 26, 2016

That's what numpy.random.shuffle() is for :

>>> X = np.random.random((6, 2))
>>> X
array([[ 0.9818058 ,  0.67513579],
       [ 0.82312674,  0.82768118],
       [ 0.29468324,  0.59305925],
       [ 0.25731731,  0.16676408],
       [ 0.27402974,  0.55215778],
       [ 0.44323485,  0.78779887]])

>>> np.random.shuffle(X)
>>> X
array([[ 0.9818058 ,  0.67513579],
       [ 0.44323485,  0.78779887],
       [ 0.82312674,  0.82768118],
       [ 0.29468324,  0.59305925],
       [ 0.25731731,  0.16676408],
       [ 0.27402974,  0.55215778]])