How to incrementally sample without replacement?

necromancer · Sep 20, 2013 · Viewed 20.3k times

Python has my_sample = random.sample(range(100), 10) to randomly sample without replacement from [0, 100).

Suppose I have sampled n such numbers and now want to sample one more without replacement (that is, excluding all n previously sampled numbers). How can I do so super efficiently?

update: changed from "reasonably efficiently" to "super efficiently" (but ignoring constant factors)

Answer

Tim Peters · Sep 20, 2013

If you know in advance that you're going to want multiple samples without overlaps, the easiest approach is to call random.shuffle() on list(range(100)) (that's Python 3; you can skip the list() in Python 2), then peel off slices as needed.

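import random

s = list(range(100))
random.shuffle(s)        # one O(n) shuffle up front
first_sample = s[-10:]   # take the last 10 items...
del s[-10:]              # ...and remove them so they can't be drawn again
second_sample = s[-10:]
del s[-10:]
# etc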

Otherwise, @Chronial's answer is reasonably efficient.
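
To make the "one more sample" case from the question concrete, the shuffle-then-peel approach can be wrapped in a small helper. This is only a sketch: the class name IncrementalSampler, the draw() method, and the optional rng argument are made up for illustration and are not part of the random module. It assumes you are willing to pay for one O(n) shuffle up front, after which each draw costs time proportional to the number of items requested.

```python
import random


class IncrementalSampler:
    """Hands out distinct values from range(n) across repeated calls.

    Hypothetical helper for illustration; not part of the standard library.
    """

    def __init__(self, n, rng=None):
        # rng is an assumed hook so callers can pass a seeded random.Random.
        self._rng = rng if rng is not None else random.Random()
        self._pool = list(range(n))
        self._rng.shuffle(self._pool)  # one O(n) shuffle up front

    def draw(self, k=1):
        """Return a list of k values that no earlier call has returned."""
        if k < 0 or k > len(self._pool):
            raise ValueError("cannot draw %d values, %d left" % (k, len(self._pool)))
        split = len(self._pool) - k   # index-based split also handles k == 0
        taken = self._pool[split:]
        del self._pool[split:]        # deleting from the end is cheap
        return taken


sampler = IncrementalSampler(100)
my_sample = sampler.draw(10)   # the initial sample of 10
one_more = sampler.draw()[0]   # one more value, guaranteed not in my_sample
```

The split index is computed explicitly rather than slicing with -k because s[-0:] is the whole list, so a request for zero items would otherwise empty the pool.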