Slowly transitioning from Matlab to Python...
I have this list of the form
list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]]
and another list with the same number of items
list2 = [1, 2, 3, 4]
I'm trying to extract the elements of list1 that do not contain any nan values, along with the corresponding elements of list2, i.e. the result should be:
list1_clean = [[3, 7, 8], [1, 1, 1]]
list2_clean = [2, 3]
In Matlab this is easily done with logical indexing.
Here I get the feeling a list comprehension of some form will do the trick, but I'm stuck at:
list1_clean = [x for x in list1 if not any(isnan(x))]
which obviously is of no use for list2.
Alternatively, the following attempt at logical indexing does not work ("indices must be integers, not lists"):
idx = [any(isnan(x)) for x in list1]
list1_clean = list1[idx]
list2_clean = list2[idx]
I'm certain it's painfully trivial, but I can't figure it out. Help appreciated!
You can use zip. zip returns the items at the same index from the iterables passed to it.
>>> from math import isnan
>>> list1 = [[1, 2, 'nan'], [3, 7, 8], [1, 1, 1], [10, -1,'nan']]
>>> list2 = [1, 2, 3, 4]
>>> out = [(x,y) for x,y in zip(list1,list2)
...        if not any(isnan(float(z)) for z in x)]
>>> out
[([3, 7, 8], 2), ([1, 1, 1], 3)]
Now unzip out to get the required output:
>>> list1_clean, list2_clean = map(list, zip(*out))
>>> list1_clean
[[3, 7, 8], [1, 1, 1]]
>>> list2_clean
[2, 3]
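The filter-and-unzip steps above can be wrapped into one function. The following is only a minimal sketch, assuming Python 3 and real float('nan') values rather than the string 'nan'; the name filter_nan_rows is hypothetical. It also covers the case where every row contains a nan, in which zip(*out) yields nothing and the two-name unpacking would fail.

```python
from math import isnan


def filter_nan_rows(rows, labels):
    """Keep the (row, label) pairs whose row contains no NaN."""
    kept = [(row, label) for row, label in zip(rows, labels)
            if not any(isnan(float(value)) for value in row)]
    if not kept:
        # zip(*[]) yields nothing, so unpacking into two names would fail.
        return [], []
    clean_rows, clean_labels = map(list, zip(*kept))
    return clean_rows, clean_labels


nan = float("nan")
list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]]
list2 = [1, 2, 3, 4]
list1_clean, list2_clean = filter_nan_rows(list1, list2)
print(list1_clean)  # [[3, 7, 8], [1, 1, 1]]
print(list2_clean)  # [2, 3]
```

Returning two empty lists for the all-nan case is a design choice made for this sketch; raising an error instead may suit some callers better.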
Help on zip:
>>> print zip.__doc__
zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]
Return a list of tuples, where each tuple contains the i-th element
from each of the argument sequences. The returned list is truncated
in length to the length of the shortest argument sequence.
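In Python 2, you can use itertools.izip if you want a memory-efficient solution, as it returns an iterator instead of building a list. In Python 3, the built-in zip already returns an iterator and itertools.izip no longer exists.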
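For something closer to Matlab's logical indexing, one option is to build the boolean mask once and apply it to both lists. The sketch below uses itertools.compress from the standard library, which is a different technique from the zip approach above. It assumes Python 3 and real float('nan') values; the variable name keep is only illustrative.

```python
from itertools import compress
from math import isnan

nan = float("nan")
list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]]
list2 = [1, 2, 3, 4]

# True for the rows to keep, i.e. rows without any NaN.
keep = [not any(isnan(value) for value in row) for row in list1]

# compress() yields the items whose matching mask entry is truthy.
list1_clean = list(compress(list1, keep))
list2_clean = list(compress(list2, keep))

print(list1_clean)  # [[3, 7, 8], [1, 1, 1]]
print(list2_clean)  # [2, 3]
```

Note that the mask here marks the rows to keep, so it is the negation of the idx list in the question, which marked the rows containing nan.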