[I'm splitting a population number into different matrices and want to test my code using random numbers for now.]
Quick question guys, and thanks in advance for your help.
If I use:
100*rand(9,1)
What is the best way to make these 9 numbers add to 100?
I'd like 9 random numbers between 0 and 100 that add up to 100.
Is there a built-in command that does this? I can't seem to find one.
I see this mistake so often: the suggestion that to generate random numbers with a given sum, one just generates a uniform random set and rescales it. But is the result truly uniformly random if you do it that way?
Try this simple test in two dimensions. Generate a huge random sample of pairs, then scale each pair to sum to 1. I'll use bsxfun to do the scaling.
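xy = rand(10000000,2);                  % 10 million uniform points in the unit square
xy = bsxfun(@times,xy,1./sum(xy,2));    % rescale each row so that x + y = 1
hist(xy(:,1),100)                       % histogram of x; flat would mean uniform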
If they were truly uniformly random, then the x coordinate would be uniform, as would the y coordinate. Any value would be equally likely to happen. In effect, for two numbers x and y to sum to 1, the point (x,y) must lie on the line segment that connects the two points (0,1) and (1,0) in the (x,y) plane. For the points to be uniform, any point along that segment must be equally likely.
Clearly uniformity fails when I use the scaling solution: the histogram is peaked around x = 0.5 instead of being flat. Any point on that line is NOT equally likely. We can see the same thing happening in 3 dimensions. See that in the 3-d figure here, the points in the center of the triangular region are more densely packed. This is a reflection of non-uniformity.
xyz = rand(10000,3);                       % uniform points in the unit cube
xyz = bsxfun(@times,xyz,1./sum(xyz,2));    % rescale each row so that x + y + z = 1
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
view(70,35)
box on
grid on
Again, the simple scaling solution fails. It simply does NOT produce truly uniform results over the domain of interest.
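To put a number on the 2-d failure, here is a rough check. It is only a sketch: the interval [0.4,0.6] and the sample size are arbitrary choices, not part of the test above. If x were uniform on [0,1], about 20% of the scaled samples would land in that interval. With the scaling approach the fraction should come out near 1/3 instead.

```matlab
% Fraction of rescaled x values that land in the middle fifth of [0,1].
N = 1000000;                              % sample size, chosen arbitrarily
xy = rand(N,2);                           % uniform points in the unit square
xy = bsxfun(@rdivide,xy,sum(xy,2));       % rescale each row so that x + y = 1
fracMiddle = mean(xy(:,1) >= 0.4 & xy(:,1) <= 0.6);
fprintf('fraction in [0.4,0.6]: %.3f (uniform would give 0.200)\n',fracMiddle)
```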
Can we do better? Well, yes. A simple solution in 2-d is to generate a single random number that designates the distance along the line connecting the points (0,1) and (1,0).
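t = rand(10000000,1);                % position along the segment, uniform on [0,1]
xy = t*[0 1] + (1-t)*[1 0];          % blend of the two endpoints, one row per sample
hist(xy(:,1),100)                    % histogram of x, now flat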
It can be shown that ANY point along the line defined by the equation x+y = 1, in the unit square, is now equally likely to have been chosen. This is reflected by the nice, flat histogram.
Does the sort trick suggested by David Schwartz work in n dimensions? Clearly it does so in 2-d, and the figure below suggests that it does so in 3 dimensions. Without deep thought on the matter, I believe that it will work for the basic case in question, in n dimensions.
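n = 10000;
uv = [zeros(n,1),sort(rand(n,2),2),ones(n,1)];   % two sorted cut points per row, padded with 0 and 1
xyz = diff(uv,[],2);                             % the three gaps are non-negative and sum to 1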
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
box on
grid on
view(70,35)
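Applied to the original question, the same sort trick might look like the sketch below. The variable names (nNumbers, total, nSets, cuts, X) are just illustrative. Each row of X holds 9 numbers between 0 and 100 that sum to 100.

```matlab
% Sort trick for 9 numbers that sum to 100: cut [0,1] at 8 sorted random
% points, take the 9 gaps between consecutive cuts, and scale by the total.
nNumbers = 9;      % how many numbers per set
total    = 100;    % required sum
nSets    = 5;      % how many independent sets to draw (illustrative)
cuts = sort(rand(nSets,nNumbers-1),2);                     % sorted cut points per row
X = total*diff([zeros(nSets,1),cuts,ones(nSets,1)],[],2);  % gaps scaled to the total
disp(sum(X,2))     % row sums, each 100 up to rounding error
```

Set nSets = 1 to get a single set of 9 numbers.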
One can also download the function randfixedsum, Roger Stafford's contribution, from the File Exchange. This is a more general solution that generates truly uniform random sets in the unit hyper-cube, with any given fixed sum. Thus, to generate random sets of points that lie in the unit 3-cube, subject to the constraint that they sum to 1.25...
xyz = randfixedsum(3,10000,1.25,0,1)';   % each column of the output is one point, so transpose to rows
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
view(70,35)
box on
grid on
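For the original question, a call along the same lines might look like the sketch below. It assumes randfixedsum has been downloaded and is on the path, and that it keeps the argument order used above (set size, number of sets, sum, lower bound, upper bound). The fallback branch is not part of randfixedsum; it just reuses the sort trick so that the snippet still runs without the download.

```matlab
% Nine numbers in [0,100] that sum to 100.
n = 9;
total = 100;
if exist('randfixedsum','file') == 2
    x = randfixedsum(n,1,total,0,total);         % one set, returned as an n-by-1 column
else
    x = total*diff([0,sort(rand(1,n-1)),1])';    % fallback: sort trick, transposed to a column
end
disp(sum(x))    % 100 up to rounding error
```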