Algorithm For Ranking Items

Krzysztof Czelusniak · Jan 5, 2011 · Viewed 28k times

I have a list of 6500 items that I would like to trade or invest in. (Not for real money, but for a certain game.) Each item has 5 numbers that will be used to rank it among the others.

Total quantity of item traded per day: The higher this number, the better.

The Donchian Channel of the item over the last 5 days: The higher this number, the better.

The median spread of the price: The lower this number, the better.

The spread of the 20 day moving average for the item: The lower this number, the better.

The spread of the 5 day moving average for the item: The higher this number, the better.

All 5 numbers have the same 'weight'; in other words, they should all affect the final number equally.

At the moment, I just multiply all 5 numbers for each item, but that doesn't rank the items the way I would like them to be ranked. I want to combine all 5 numbers into a single weighted number that I can use to rank all 6500 items, but I'm unsure how to do this in a mathematically sound way.

Note: The total quantity of the item traded per day and the Donchian Channel are numbers that are much higher than the spreads, which are more like percentages. This is probably the reason why multiplying them all together didn't work for me; the quantity traded per day and the Donchian Channel had a much bigger role in the final number.

Answer

Nick Fortescue · Jan 5, 2011

The reason people are having trouble answering this question is that we have no way of comparing two different "attributes". If there were just two attributes, say quantity traded and median price spread, would (20 million, 50%) be worse or better than (100, 1%)? Only you can decide this.

Converting everything into numbers of the same size could help; this is known as "normalisation". A good way of doing this is the z-score, which Prasad mentions. This is a statistical concept that measures how far a value lies from the mean of its attribute, in units of standard deviation. You need to make some assumptions about the statistical distributions of your numbers to use this.

Things like spreads are probably roughly normally distributed. For these, as Prasad says, take z(spread) = (spread - mean(spreads)) / standardDeviation(spreads).
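As a minimal sketch of the z-score formula above, using only Python's standard library: the function name z_scores and the sample spreads are made up for illustration, and using the population standard deviation (pstdev) rather than the sample one (stdev) is an assumption that makes little difference across 6500 items.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (x - mean) / standard deviation for each x in values."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # All values are identical, so no item stands out on this attribute.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

# Hypothetical median spreads expressed as fractions (made-up numbers).
spreads = [0.01, 0.02, 0.03, 0.04, 0.05]
print(z_scores(spreads))
```

After this step every attribute is centred on 0 with a standard deviation of 1, so a large-valued attribute no longer dominates a small-valued one.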

Things like the quantity traded might follow a power-law distribution. For these you might want to take the log() before calculating the mean and sd. That is, the z-score is z(qty) = (log(qty) - mean(log(quantities))) / sd(log(quantities)).
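The log variant can be sketched the same way. The name log_z_scores and the quantities below are hypothetical. Note that log() is undefined for 0, so an item that was not traded at all would need separate handling (for example, being dropped before ranking); that choice is left open here.

```python
from math import log
from statistics import mean, pstdev

def log_z_scores(values):
    """z-score computed on log(value); every value must be greater than 0."""
    logs = [log(v) for v in values]  # raises ValueError if any value is <= 0
    mu = mean(logs)
    sigma = pstdev(logs)
    if sigma == 0:
        return [0.0 for _ in logs]
    return [(x - mu) / sigma for x in logs]

# Hypothetical daily traded quantities spanning several orders of magnitude.
quantities = [100, 1_000, 10_000, 100_000, 20_000_000]
print(log_z_scores(quantities))
```

Taking the log first means the item with 20 million trades is scored by how many orders of magnitude it sits above the rest, instead of swamping the mean and standard deviation on its own.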

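Then, for each item, just add up the z-scores of its attributes, negating those where a lower number is better (the median spread and the 20 day moving average spread), and rank the items by the total.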
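Putting the pieces together, here is a hedged sketch of the whole ranking. The dictionary keys, the ATTRIBUTES table, and the sample items are all invented for illustration. Two choices in it are assumptions rather than part of the answer: z-scores of "lower is better" attributes are negated so that a higher total is always better, and only the traded quantity is log-transformed.

```python
from math import log
from statistics import mean, pstdev

def z_scores(values):
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma if sigma else 0.0 for x in values]

# (key, higher_is_better, use_log) for the five attributes in the question.
# Which attributes get a log transform is a guess; check their distributions.
ATTRIBUTES = [
    ("quantity", True, True),
    ("donchian", True, False),
    ("median_spread", False, False),
    ("spread_ma20", False, False),
    ("spread_ma5", True, False),
]

def combined_scores(items):
    """Sum the signed z-scores of every attribute, giving each equal weight."""
    totals = [0.0] * len(items)
    for key, higher_is_better, use_log in ATTRIBUTES:
        column = [log(item[key]) if use_log else item[key] for item in items]
        sign = 1.0 if higher_is_better else -1.0  # flip "lower is better"
        for i, z in enumerate(z_scores(column)):
            totals[i] += sign * z
    return totals

def rank_items(items):
    """Return (name, score) pairs, best item first."""
    scores = combined_scores(items)
    order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [(items[i]["name"], scores[i]) for i in order]

# Made-up sample data, not real market figures.
items = [
    {"name": "A", "quantity": 2_000_000, "donchian": 900.0,
     "median_spread": 0.01, "spread_ma20": 0.02, "spread_ma5": 0.06},
    {"name": "B", "quantity": 50_000, "donchian": 400.0,
     "median_spread": 0.03, "spread_ma20": 0.04, "spread_ma5": 0.04},
    {"name": "C", "quantity": 100, "donchian": 50.0,
     "median_spread": 0.08, "spread_ma20": 0.09, "spread_ma5": 0.01},
]
for name, score in rank_items(items):
    print(name, round(score, 3))
```

Because every z-score has the same spread, adding them gives each attribute the same weight; unequal weights could be introduced by multiplying each z-score by a factor before summing.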

To do this for each attribute you will need to have an idea of its distribution. You could guess, but the best way is to plot a graph and have a look. You might also want to plot graphs on log scales. See Wikipedia for a long list of distributions.
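If no plotting library is at hand, a rough text histogram is enough for a first look. This is only a sketch: bin_counts and show are hypothetical helpers, the quantities are made up, and a proper plotting tool would give a better picture.

```python
from math import log10

def bin_counts(values, bins=5):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would land in bin `bins`, so clamp it to the last bin.
        counts[min(int((v - lo) * bins / span), bins - 1)] += 1
    return counts

def show(label, values, bins=5):
    """Print one row of '#' marks per bin, lowest bin first."""
    print(label)
    for count in bin_counts(values, bins):
        print("  " + "#" * count, count)

# Hypothetical daily traded quantities (made-up numbers).
quantities = [120, 150, 300, 900, 2_500, 8_000, 40_000,
              150_000, 2_000_000, 20_000_000]
show("raw quantities", quantities)
show("log10 of quantities", [log10(q) for q in quantities])
```

If nearly everything piles into the first bin on the raw scale but spreads out more evenly on the log scale, that suggests taking the log before computing the z-score for that attribute.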