How to generate list of unique random floats in Python

Simon picture Simon · Jul 30, 2017 · Viewed 9.9k times · Source

I know that there are easy ways to generate lists of unique random integers (e.g. random.sample(range(1, 100), 10)).

I wonder whether there is some better way of generating a list of unique random floats, apart from writing a function that acts like a range, but accepts floats like this:

import random

def float_range(start, stop, step):
    vals = []
    i = 0
    current_val = start
    while current_val < stop:
        vals.append(current_val)
        i += 1
        current_val = start + i * step
    return vals

unique_floats = random.sample(float_range(0, 2, 0.2), 3)

Is there a better way to do this?

Answer

Raymond Hettinger picture Raymond Hettinger · Jul 30, 2017

Answer

One easy way is to keep a set of all random values seen so far and reselect if there is a repeat:

import random

def sample_floats(low, high, k=1):
    """ Return a k-length list of unique random floats
        in the range of low <= x <= high
    """
    result = []
    seen = set()
    for i in range(k):
        x = random.uniform(low, high)
        while x in seen:
            x = random.uniform(low, high)
        seen.add(x)
        result.append(x)
    return result

Notes

  • This technique is how Python's own random.sample() is implemented.

  • The function uses a set to track previous selections because searching a set is O(1) while searching a list is O(n).

  • Computing the probability of a duplicate selection is equivalent to the famous Birthday Problem.

  • Given 2**53 distinct possible values from random(), duplicates are infrequent. On average, you can expect a duplicate float at about 120,000,000 samples.

Variant: Limited float range

If the population is limited to just a range of evenly spaced floats, then it is possible to use random.sample() directly. The only requirement is that the population be a Sequence:

from __future__ import division
from collections import Sequence

class FRange(Sequence):
    """ Lazily evaluated floating point range of evenly spaced floats
        (inclusive at both ends)

        >>> list(FRange(low=10, high=20, num_points=5))
        [10.0, 12.5, 15.0, 17.5, 20.0]

    """
    def __init__(self, low, high, num_points):
        self.low = low
        self.high = high
        self.num_points = num_points

    def __len__(self):
        return self.num_points

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if index < 0 or index >= len(self):
            raise IndexError('Out of range')
        p = index / (self.num_points - 1)
        return self.low * (1.0 - p) + self.high * p

Here is a example of choosing ten random samples without replacement from a range of 41 evenly spaced floats from 10.0 to 20.0.

>>> import random
>>> random.sample(FRange(low=10.0, high=20.0, num_points=41), k=10)
[13.25, 12.0, 15.25, 18.5, 19.75, 12.25, 15.75, 18.75, 13.0, 17.75]