How can I limit iterations of a loop in Python?

Aaron Hall · Mar 19, 2016 · Viewed 72.9k times

Say I have a list of items, and I want to iterate over the first few of them:

items = list(range(10)) # I mean this to represent any kind of iterable.
limit = 5

Naive implementation

The Python naïf coming from other languages would probably write this perfectly serviceable and performant (if unidiomatic) code:

index = 0
for item in items: # Python's `for` loop is a for-each.
    print(item)    # or whatever function of that item.
    index += 1
    if index == limit:
        break

More idiomatic implementation

But Python has enumerate, which subsumes about half of that code nicely:

for index, item in enumerate(items, start=1): # count from 1 so index == limit on the last wanted item
    print(item)
    if index == limit: # There's gotta be a better way.
        break

So we've about cut the extra code in half. But there's gotta be a better way.

Can we approximate the below pseudocode behavior?

If enumerate took an optional stop argument (it already takes a start argument, as in enumerate(items, start=1)), that would, I think, be ideal. But the code below doesn't work, because no such argument exists (see the documentation on enumerate):

# hypothetical code, not implemented:
for _, item in enumerate(items, start=0, stop=limit): # `stop` not implemented
    print(item)

Note that there would be no need to name the index because there is no need to reference it.

Is there an idiomatic way to write the above? How?

A secondary question: why isn't this built into enumerate?

Answer

Aaron Hall · Mar 19, 2016

How can I limit iterations of a loop in Python?

for index, item in enumerate(items, start=1): # count from 1 so index == limit on the last wanted item
    print(item)
    if index == limit:
        break

Is there a shorter, idiomatic way to write the above? How?

Including the index

zip stops on the shortest iterable of its arguments. (In contrast with the behavior of zip_longest, which uses the longest iterable.)
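As a small illustration of that difference, here is a sketch comparing the two functions on inputs of unequal length. The names letters, numbers, short_pairs and long_pairs are made up for this example.

```python
from itertools import zip_longest

letters = ["a", "b", "c", "d"]  # illustrative data
numbers = range(2)              # the shorter input

# zip stops as soon as the shortest input is exhausted.
short_pairs = list(zip(numbers, letters))

# zip_longest keeps going and pads the shorter input with None by default.
long_pairs = list(zip_longest(numbers, letters))

print(short_pairs)  # [(0, 'a'), (1, 'b')]
print(long_pairs)   # [(0, 'a'), (1, 'b'), (None, 'c'), (None, 'd')]
```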

range can provide a limited iterable that we can pass to zip along with our primary iterable.

So we can pass a range object (with its stop argument) to zip and use it like a limited enumerate.

zip(range(limit), items)

In Python 3, zip and range are lazy: they produce values on demand, pipelining the data instead of materializing it in lists for intermediate steps.

for index, item in zip(range(limit), items):
    print(index, item)
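Because nothing is materialized, this pattern should also work on an endless iterator. The sketch below assumes itertools.count() as a stand-in for such an input, and also shows why putting range first can matter when the items come from a one-shot iterator. All variable names are illustrative.

```python
from itertools import count

limit = 5

# count() never ends, but zip stops once range(limit) is exhausted.
squares = [n * n for _, n in zip(range(limit), count())]
print(squares)  # [0, 1, 4, 9, 16]

# With range first, zip notices the range is exhausted before touching the iterator again.
range_first = iter(range(10))
for _ in zip(range(3), range_first):
    pass
after_range_first = next(range_first)  # 3: nothing extra was consumed

# With the iterator first, zip pulls one more item before it sees the range is done.
items_first = iter(range(10))
for _ in zip(items_first, range(3)):
    pass
after_items_first = next(items_first)  # 4: item 3 was consumed and discarded
```

If the remaining items are needed afterwards, keeping range as the first argument avoids silently dropping one.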

To get the same behavior in Python 2, just substitute xrange for range and itertools.izip for zip.

from itertools import izip
for index, item in izip(xrange(limit), items):
    print(item)

If not requiring the index, itertools.islice

You can use itertools.islice:

import itertools

for item in itertools.islice(items, 0, limit):
    print(item)

which doesn't require assigning to the index.
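A minimal sketch of islice on an input that cannot be sliced with subscript notation. The generator naturals is hypothetical and exists only for this illustration.

```python
from itertools import islice

def naturals():
    """Hypothetical endless generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

limit = 5

# Take only the first `limit` values; the generator is never run to exhaustion.
first_five = list(islice(naturals(), limit))
print(first_five)  # [0, 1, 2, 3, 4]

# islice also accepts start, stop and step, like a slice (but no negative values).
every_other = list(islice(naturals(), 0, 10, 2))
print(every_other)  # [0, 2, 4, 6, 8]

# A stop larger than the input is fine: iteration simply ends early.
short = list(islice("abc", 10))
print(short)  # ['a', 'b', 'c']
```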

Composing enumerate(islice(items, stop)) to get the index

As Pablo Ruiz Ruiz points out, we can also compose islice with enumerate.

from itertools import islice

for index, item in enumerate(islice(items, limit)):
    print(index, item)

Why isn't this built into enumerate?

Here's enumerate implemented in pure Python (with possible modifications to get the desired behavior in comments):

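def enumerate(collection, start=0):  # could add stop=None
    i = start
    it = iter(collection)
    while True:                      # could modify to `while i != stop:`
        try:
            item = next(it)
        except StopIteration:        # since Python 3.7 (PEP 479), letting this escape
            return                   # a generator raises RuntimeError, so end cleanly
        yield (i, item)
        i += 1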

The above would be less performant for those already using enumerate, because it would have to check on every iteration whether it is time to stop. Instead, we can check once and use the old enumerate if we don't get a stop argument:

_enumerate = enumerate

def enumerate(collection, start=0, stop=None):
    if stop is not None:
        return zip(range(start, stop), collection)
    return _enumerate(collection, start)

This extra check has a negligible performance impact.
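To try the wrapper without shadowing the builtin, here is a sketch under a different, hypothetical name, enumerate_stop. Note one assumption baked into this design: stop bounds the index (as in range), not the number of items, so with a nonzero start it yields stop - start pairs at most.

```python
import builtins

def enumerate_stop(collection, start=0, stop=None):
    """Hypothetical helper: enumerate with an exclusive upper bound on the index."""
    if stop is not None:
        # range(start, stop) runs out first and ends the iteration.
        return zip(range(start, stop), collection)
    # No stop given: fall back to the builtin, with no per-item overhead.
    return builtins.enumerate(collection, start)

items = list(range(10, 20))  # illustrative data

limited = list(enumerate_stop(items, start=1, stop=4))
print(limited)    # [(1, 10), (2, 11), (3, 12)]

unlimited = list(enumerate_stop("ab"))
print(unlimited)  # [(0, 'a'), (1, 'b')]
```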

As to why enumerate does not have a stop argument, this was originally proposed (see PEP 279):

This function was originally proposed with optional start and stop arguments. GvR [Guido van Rossum] pointed out that the function call enumerate(seqn, 4, 6) had an alternate, plausible interpretation as a slice that would return the fourth and fifth elements of the sequence. To avoid the ambiguity, the optional arguments were dropped even though it meant losing flexibility as a loop counter. That flexibility was most important for the common case of counting from one, as in:

for linenum, line in enumerate(source,1):  print linenum, line

So apparently start was kept because it was very valuable, and stop was dropped because it had fewer use-cases and contributed to confusion on the usage of the new function.
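The ambiguity can be made concrete with a sketch of the two readings of a call like enumerate(seqn, 4, 6). Neither is real enumerate behavior. Both are emulated here with zip, range and islice, and the second reading assumes Python's usual zero-based, half-open slice semantics.

```python
from itertools import islice

seqn = ["a", "b", "c", "d", "e", "f", "g"]  # illustrative data

# Reading 1: a loop counter that runs from 4 up to (not including) 6,
# paired with the items from the beginning of the sequence.
as_counter = list(zip(range(4, 6), seqn))
print(as_counter)  # [(4, 'a'), (5, 'b')]

# Reading 2: a slice of the sequence itself, numbered by position.
as_slice = list(zip(range(4, 6), islice(seqn, 4, 6)))
print(as_slice)    # [(4, 'e'), (5, 'f')]
```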

Avoid slicing with subscript notation

Another answer says:

Why not simply use

for item in items[:limit]: # or limit+1, depends

Here are a few downsides:

  • It only works for iterables that accept slicing, thus it is more limited.
  • If they do accept slicing, it usually creates a new data structure in memory instead of iterating over the original one, which wastes memory. (Built-in sequences such as list, tuple, and str make copies when sliced, whereas numpy arrays, for example, return a view.)
  • Unsliceable iterables need different handling anyway, so if you later switch to a lazy evaluation model, you'll have to rewrite the slicing code as well.

You should only use slicing with subscript notation when you understand the limitations and whether it makes a copy or a view.
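The downsides listed above can be seen in a short sketch. The variable names are illustrative, and the generator expression stands in for any lazy, unsliceable iterable.

```python
from itertools import islice

limit = 3

# Slicing a list builds a new list: changing the slice leaves the original alone.
items = list(range(10))
sliced = items[:limit]
sliced[0] = 99
print(items[0])  # 0

# A generator does not support subscript notation at all.
gen = (n * n for n in range(10))
try:
    gen[:limit]
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError

# islice works on the same generator without copying anything up front.
first = list(islice(gen, limit))
print(first)  # [0, 1, 4]
```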

Conclusion

I would presume that, now that the Python community knows how enumerate is used, the confusion costs would be outweighed by the value of a stop argument.

Until that time, you can use:

for index, element in zip(range(limit), items):
    ...

or

for index, item in enumerate(islice(items, limit)):
    ...

or, if you don't need the index at all:

for element in islice(items, 0, limit):
    ...

And avoid slicing with subscript notation, unless you understand the limitations.