How to loop through IEnumerable in batches

user1526912 · Mar 14, 2013 · Viewed 63.8k times

I am developing a C# program which has an "IEnumerable users" that stores the ids of 4 million users. I need to loop through the IEnumerable and extract a batch of 1000 ids each time to perform some operations in another method.

How do I extract 1000 ids at a time from the start of the IEnumerable, do something else, then fetch the next batch of 1000, and so on?

Is this possible?

Answer

Sergey Berezovskiy · Mar 14, 2013

You can use MoreLINQ's Batch operator (available from NuGet):

foreach(IEnumerable<User> batch in users.Batch(1000))
   // use batch
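
As an aside, projects targeting .NET 6 or later also have the built-in Enumerable.Chunk method, which is a different operator from MoreLINQ's Batch but covers the same need without an extra package. The sketch below is only an illustration under assumptions: the ids are taken to be int, and ChunkSketch, ProcessBatch and Run are hypothetical names standing in for the "other method" from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkSketch
{
    // Hypothetical stand-in for the method that works on one batch of ids.
    // Here it only reports how many ids it received.
    public static int ProcessBatch(int[] ids) => ids.Length;

    // Walks the sequence in batches of at most batchSize ids and
    // records the result of processing each batch.
    public static List<int> Run(IEnumerable<int> users, int batchSize)
    {
        var processed = new List<int>();

        // Enumerable.Chunk (.NET 6+) yields arrays; the last one may be shorter.
        foreach (int[] batch in users.Chunk(batchSize))
            processed.Add(ProcessBatch(batch));

        return processed;
    }
}
```

Calling ChunkSketch.Run(Enumerable.Range(1, 2500), 1000) processes three batches of 1000, 1000 and 500 ids.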

If adding MoreLINQ as a dependency is not an option, you can reuse its implementation:

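using System;
using System.Collections.Generic;
using System.Linq;

// Extension methods must live in a static class; the class name is arbitrary.
public static class EnumerableExtensions
{
    // Splits source into consecutive batches of at most `size` items.
    // Batches are produced lazily, one at a time, as the caller iterates.
    public static IEnumerable<IEnumerable<T>> Batch<T>(
        this IEnumerable<T> source, int size)
    {
        T[] bucket = null;
        var count = 0;

        foreach (var item in source)
        {
            // Allocate a fresh array for each batch
            if (bucket == null)
                bucket = new T[size];

            bucket[count++] = item;

            // Keep filling until the bucket is full
            if (count != size)
                continue;

            yield return bucket.Select(x => x);

            bucket = null;
            count = 0;
        }

        // Return the last bucket with all remaining elements
        if (bucket != null && count > 0)
        {
            Array.Resize(ref bucket, count);
            yield return bucket.Select(x => x);
        }
    }
}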

BTW, for performance you can simply return bucket without calling Select(x => x). Select is optimized for arrays, but the selector delegate is still invoked on each item. So, in your case it's better to use:

yield return bucket;
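
Putting that suggestion together with the Batch method above gives roughly the following. This is a minimal sketch rather than MoreLINQ's actual source: the class name BatchSketch and the BatchSizes helper are hypothetical, added only to show which batch sizes come out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchSketch
{
    // Same logic as the Batch method above, but each array is yielded
    // directly instead of being wrapped in Select(x => x).
    public static IEnumerable<IEnumerable<T>> Batch<T>(
        this IEnumerable<T> source, int size)
    {
        T[] bucket = null;
        var count = 0;

        foreach (var item in source)
        {
            if (bucket == null)
                bucket = new T[size];

            bucket[count++] = item;

            if (count != size)
                continue;

            yield return bucket;

            bucket = null;
            count = 0;
        }

        // Trim and return the last, partially filled bucket
        if (bucket != null && count > 0)
        {
            Array.Resize(ref bucket, count);
            yield return bucket;
        }
    }

    // Hypothetical helper: batches the ids 1..totalIds and returns
    // the size of every batch, in order.
    public static List<int> BatchSizes(int totalIds, int batchSize)
    {
        var sizes = new List<int>();
        foreach (var batch in Enumerable.Range(1, totalIds).Batch(batchSize))
            sizes.Add(batch.Count());
        return sizes;
    }
}
```

With 2500 ids and a batch size of 1000, BatchSizes returns 1000, 1000 and 500. One trade-off of this variant is that callers receive the underlying array, so they could cast a batch back to T[] and modify it. Because each batch gets its own freshly allocated array, that does not affect other batches.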