Generating an even random range in Netezza

Chris picture Chris · May 10, 2013 · Viewed 7.6k times · Source

Netezza says that its random() function generates a float between 0.0 and 0.99999... I'm trying to generate random ints within a range (e.g., between 5 and 10). The internet told me to use an equation like this to scale a fraction to a number in a range:

select f,count(*) from (
    select CAST(5 + random() * (10-5) as INT) as f
    from table_of_numbers
    where number between 1 and 5000
) x group by 1 order by 1

However, when I use that code, the extreme values are under-represented in the sample:

F    COUNT
5    486    <---
6    992
7    1057
8    1000
9    937
10   528    <---
Does anyone know how I can fix this?

Thanks!

Answer

Gordon Linoff picture Gordon Linoff · May 10, 2013

The expression random()*(10-5) produces numbers from 0 to 4.99999. However, you have a range of 6 values (5, 6, 7, 8, 9, and 10). So your expression is splitting 5 values across 6 buckets.

Your code is finding that the first and last are half full. Apparently, the cast() operation is rounding the values rather than truncating them (I don't think this is ANSI SQL, but it would explain your observed results). This masks the problem.

Try this:

select CAST(4.5 + random() * (10-5+1) as INT) 

In other databases, something like this should work:

select CAST(5 + random() * (10-5+1) as INT)