Python generator expression if-else

Nupur · Aug 24, 2012

I am using Python to parse a large file. What I want to do is:

if condition is true:
    append to list A
else:
    append to list B

I want to use generator expressions for this to save memory. Here is the actual code:

from Bio import SeqIO

def iter_length(the_gen):
    # Count the items in a generator without building a list.
    return sum(1 for _ in the_gen)

def is_low_qual(read):
    # A read is low quality if more than num_allowed bases fall below qual_threshold.
    lowqual_bp = (bq for bq in phred_quals(read) if bq < qual_threshold)
    return iter_length(lowqual_bp) > num_allowed

lowqual = (read for read in SeqIO.parse(r_file, "fastq") if is_low_qual(read))
highqual = (read for read in SeqIO.parse(r_file, "fastq") if not is_low_qual(read))

SeqIO.write(highqual, flt_out_handle, "fastq")
SeqIO.write(lowqual, junk_out_handle, "fastq")

Answer

nneonneo · Aug 24, 2012

You can use itertools.tee in conjunction with itertools.ifilter and itertools.ifilterfalse (in Python 3, the built-in filter and itertools.filterfalse):

import itertools
def is_condition_true(x):
    ...

gen1, gen2 = itertools.tee(sequences)
low = itertools.ifilter(is_condition_true, gen1)
high = itertools.ifilterfalse(is_condition_true, gen2)

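Using tee ensures that this works correctly even if sequences is itself a generator: each filter gets its own independent iterator, so neither one exhausts the source before the other has seen it.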
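As a minimal runnable sketch of the same pattern in Python 3, the following uses the built-in filter and itertools.filterfalse. The predicate is_low, the generator numbers, and the helper split_by are made-up stand-ins for is_condition_true and the parsed reads; they are illustrative assumptions, not part of the original code.

```python
import itertools


def is_low(x):
    # Hypothetical predicate standing in for is_condition_true.
    return x < 5


def numbers():
    # Stand-in for a one-shot source such as a file parser.
    yield from [3, 8, 1, 9, 4, 7]


def split_by(predicate, iterable):
    # tee yields two independent iterators over a single pass of the source.
    gen1, gen2 = itertools.tee(iterable)
    return filter(predicate, gen1), itertools.filterfalse(predicate, gen2)


low, high = split_by(is_low, numbers())
low_items = list(low)    # items for which is_low returned True
high_items = list(high)  # items for which is_low returned False
```

Note that the predicate runs twice per item here, once in each filter, so an expensive test is paid for twice.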

Note, though, that tee buffers every item that one iterator has consumed and the other has not yet reached, so it can use a fair bit of memory (up to a list of size len(sequences)) if low and high are consumed at different rates (e.g. if low is exhausted before high is used).
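If that buffering is a concern, one alternative is a single explicit loop that routes each item as it is read. This is not what the answer above proposes, only a sketch under the assumption that each output can accept items one at a time. The names route, on_true, and on_false are hypothetical, and the two lists merely stand in for whatever would write the items out.

```python
def route(items, predicate, on_true, on_false):
    """Send each item to exactly one callback in a single pass."""
    true_count = 0
    false_count = 0
    for item in items:
        # The predicate is evaluated once per item.
        if predicate(item):
            on_true(item)
            true_count += 1
        else:
            on_false(item)
            false_count += 1
    return true_count, false_count


# Lists are used as sinks only to make the result easy to inspect.
low_sink, high_sink = [], []
counts = route(iter([3, 8, 1, 9, 4, 7]), lambda x: x < 5,
               low_sink.append, high_sink.append)
```

The trade-off is that the result is no longer a pair of lazy generators, so this only fits when both outputs can be written as the input is read.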