Using a list comprehension to replace a for loop over a 2D matrix

rankthefirst · Aug 17, 2014 · Viewed 46.6k times

I am trying to use a list comprehension to replace a for loop.

The original file is

2 3 4 5 6 3
1 2 2 4 5 5
1 2 2 2 2 4

The for loop is

line_number = 0
for line in file:
    line_data = line.split()
    Cordi[line_number, :5] = line_data 
    line_number += 1

The output is

[[2 3 4 5 6 3]
 [1 2 2 4 5 5]
 [1 2 2 2 2 4]]

Using a list comprehension instead, the best I could come up with is the following (I have to convert the values to int so they can be plotted later in the program):

Cordi1= [int(x) for x in line.split() for line in data]

but the output is

[1, 1, 1]

But line.split() for line in data is actually a list, and if I try

Cordi1 = [int(x) for x in name of the list]

it works. Why does this happen?

Answer

Martijn Pieters · Aug 17, 2014

You have the order of your loops swapped; they should be ordered in the same way they would be nested, from left to right:

[int(x) for line in data for x in line.split()]

This loops over data first, then for each line iteration, iterates over line.split() to produce x. You then produce one flat list of integers from these.
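To make the ordering rule concrete, here is a minimal sketch that writes the flat comprehension next to its nested for loop equivalent. The names flat and flat_loop are illustrative only, and data is assumed to be a list of the three lines from the question:

```python
# Assumed input: the three lines of the file, one string per line.
data = ["2 3 4 5 6 3", "1 2 2 4 5 5", "1 2 2 2 2 4"]

# Comprehension: the for clauses read left to right, outermost loop first.
flat = [int(x) for line in data for x in line.split()]

# The same logic as nested for loops, written in the same order.
flat_loop = []
for line in data:                # first for clause (outer loop)
    for x in line.split():       # second for clause (inner loop)
        flat_loop.append(int(x))
```

With the clauses swapped, as in the question, line.split() is evaluated before the comprehension binds line. It therefore uses whatever line happens to be left over in the enclosing scope, or raises NameError if there is none.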

However, since you are trying to build a list of lists, you need to nest a list comprehension inside another:

Cordi1 = [[int(i) for i in line.split()] for line in data]
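As a rough sketch of what the nested comprehension does, the inner comprehension builds one row and the outer one collects the rows. The names rows and rows_loop are illustrative, and data is again assumed to hold the three lines from the question:

```python
# Assumed input: the three lines of the file, one string per line.
data = ["2 3 4 5 6 3", "1 2 2 4 5 5", "1 2 2 2 2 4"]

# Nested comprehension: one inner list per line.
rows = [[int(i) for i in line.split()] for line in data]

# Equivalent explicit loops: build each row, then append it to the result.
rows_loop = []
for line in data:
    row = []
    for i in line.split():
        row.append(int(i))
    rows_loop.append(row)
```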

Demo:

>>> data = '''\
... 2 3 4 5 6 3
... 1 2 2 4 5 5
... 1 2 2 2 2 4
... '''.splitlines()
>>> [int(x) for line in data for x in line.split()]
[2, 3, 4, 5, 6, 3, 1, 2, 2, 4, 5, 5, 1, 2, 2, 2, 2, 4]
>>> [[int(i) for i in line.split()] for line in data]
[[2, 3, 4, 5, 6, 3], [1, 2, 2, 4, 5, 5], [1, 2, 2, 2, 2, 4]]
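Since the question iterates over a file, the same nested comprehension can likely be applied to the open file object directly, because iterating a file yields one line at a time. The sketch below uses io.StringIO as a stand-in for the real file, and the names f and matrix are hypothetical. The if line.strip() filter is an added assumption, there to skip any blank lines:

```python
import io

# Hypothetical stand-in for the opened input file from the question.
f = io.StringIO("2 3 4 5 6 3\n1 2 2 4 5 5\n1 2 2 2 2 4\n")

# split() with no arguments treats the trailing newline as whitespace,
# so no explicit strip is needed before converting to int.
matrix = [[int(x) for x in line.split()] for line in f if line.strip()]
```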

If you want a multidimensional numpy array from this, you can either convert the nested list directly to an array, or build a flat array from the data and then reshape it:

>>> import numpy as np
>>> np.array([[int(i) for i in line.split()] for line in data])
array([[2, 3, 4, 5, 6, 3],
       [1, 2, 2, 4, 5, 5],
       [1, 2, 2, 2, 2, 4]])
>>> np.array([int(i) for line in data for i in line.split()]).reshape((3, 6))
array([[2, 3, 4, 5, 6, 3],
       [1, 2, 2, 4, 5, 5],
       [1, 2, 2, 2, 2, 4]])