I know that there are easy ways to generate lists of unique random integers (e.g. random.sample(range(1, 100), 10)
).
I wonder whether there is some better way of generating a list of unique random floats, apart from writing a function that acts like a range, but accepts floats like this:
import random
def float_range(start, stop, step):
vals = []
i = 0
current_val = start
while current_val < stop:
vals.append(current_val)
i += 1
current_val = start + i * step
return vals
unique_floats = random.sample(float_range(0, 2, 0.2), 3)
Is there a better way to do this?
One easy way is to keep a set of all random values seen so far and reselect if there is a repeat:
import random
def sample_floats(low, high, k=1):
""" Return a k-length list of unique random floats
in the range of low <= x <= high
"""
result = []
seen = set()
for i in range(k):
x = random.uniform(low, high)
while x in seen:
x = random.uniform(low, high)
seen.add(x)
result.append(x)
return result
This technique is how Python's own random.sample() is implemented.
The function uses a set to track previous selections because searching a set is O(1) while searching a list is O(n).
Computing the probability of a duplicate selection is equivalent to the famous Birthday Problem.
Given 2**53 distinct possible values from random(), duplicates are infrequent. On average, you can expect a duplicate float at about 120,000,000 samples.
If the population is limited to just a range of evenly spaced floats, then it is possible to use random.sample() directly. The only requirement is that the population be a Sequence:
from __future__ import division
from collections import Sequence
class FRange(Sequence):
""" Lazily evaluated floating point range of evenly spaced floats
(inclusive at both ends)
>>> list(FRange(low=10, high=20, num_points=5))
[10.0, 12.5, 15.0, 17.5, 20.0]
"""
def __init__(self, low, high, num_points):
self.low = low
self.high = high
self.num_points = num_points
def __len__(self):
return self.num_points
def __getitem__(self, index):
if index < 0:
index += len(self)
if index < 0 or index >= len(self):
raise IndexError('Out of range')
p = index / (self.num_points - 1)
return self.low * (1.0 - p) + self.high * p
Here is a example of choosing ten random samples without replacement from a range of 41 evenly spaced floats from 10.0 to 20.0.
>>> import random
>>> random.sample(FRange(low=10.0, high=20.0, num_points=41), k=10)
[13.25, 12.0, 15.25, 18.5, 19.75, 12.25, 15.75, 18.75, 13.0, 17.75]