Choosing an attractive linear scale for a graph's Y Axis

Clinton Pierce · Nov 28, 2008 · Viewed 42.5k times

I'm writing a bit of code to display a bar (or line) graph in our software. Everything's going fine. The thing that's got me stumped is labeling the Y axis.

The caller can tell me how finely they want the Y scale labeled, but I seem to be stuck on exactly what to label them in an "attractive" kind of way. I can't describe "attractive", and probably neither can you, but we know it when we see it, right?

So if the data points are:

   15, 234, 140, 65, 90

And the user asks for 10 labels on the Y axis, a little bit of finagling with paper and pencil comes up with:

  0, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250

So there's 10 there (not including 0), the last one extends just beyond the highest value (234 < 250), and it's a "nice" increment of 25 each. If they asked for 8 labels, an increment of 30 would have looked nice:

  0, 30, 60, 90, 120, 150, 180, 210, 240

Nine would have been tricky. Maybe just using either 8 or 10 and calling it close enough would be okay. And what to do when some of the points are negative?

I can see Excel tackles this problem nicely.

Does anyone know a general-purpose algorithm (even some brute force is okay) for solving this? I don't have to do it quickly, but it should look nice.

Answer

Toon Krijthe · Nov 28, 2008

A long time ago I wrote a graph module that covered this nicely. Digging in the grey matter turns up the following:

  • Determine the lower and upper bound of the data. (Beware of the special case where lower bound = upper bound!)
  • Divide the range by the required number of ticks.
  • Round the tick range up to a nice amount.
  • Adjust the lower and upper bound accordingly.

Let's take your example:

15, 234, 140, 65, 90 with 10 ticks
  1. lower bound = 15
  2. upper bound = 234
  3. range = 234-15 = 219
  4. tick range = 219/10 = 21.9. This should be rounded up to 25.0
  5. new lower bound = 25 * floor(15/25) = 0
  6. new upper bound = 25 * ceil(234/25) = 250

So the range = 0,25,50,...,225,250
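Steps 5 and 6 can be sketched in Java roughly as follows. This is only an illustration, not the original graph module: the class name `AxisLabels` and the method `labels` are made up for the example, and the tick size is passed in as if it had already been rounded to a nice value. Rounding the lower bound down and the upper bound up also covers negative data points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class AxisLabels {
    /** Snaps [min, max] outward to multiples of tick and lists every label. */
    static List<Double> labels(double min, double max, double tick) {
        if (!(tick > 0) || max < min) {
            throw new IllegalArgumentException("need tick > 0 and min <= max");
        }
        long first = (long) Math.floor(min / tick); // lower bound rounds down
        long last = (long) Math.ceil(max / tick);   // upper bound rounds up
        List<Double> out = new ArrayList<>();
        for (long i = first; i <= last; i++) {
            out.add(i * tick); // multiply instead of accumulating to limit drift
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(labels(15, 234, 25));   // the example above
        System.out.println(labels(-40, 234, 25));  // with a negative data point
    }
}
```

With a minimum of -40 the first label becomes -50, because floor rounds toward negative infinity rather than toward zero.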

You can get the nice tick range with the following steps:

  1. divide by 10^x such that the result lies between 0.1 and 1.0 (including 0.1 excluding 1).
  2. translate accordingly:
    • 0.1 -> 0.1
    • <= 0.2 -> 0.2
    • <= 0.25 -> 0.25
    • <= 0.3 -> 0.3
    • <= 0.4 -> 0.4
    • <= 0.5 -> 0.5
    • <= 0.6 -> 0.6
    • <= 0.7 -> 0.7
    • <= 0.75 -> 0.75
    • <= 0.8 -> 0.8
    • <= 0.9 -> 0.9
    • <= 1.0 -> 1.0
  3. multiply by 10^x.

In this case, 21.9 is divided by 10^2 to get 0.219. This is <= 0.25 so we now have 0.25. Multiplied by 10^2 this gives 25.
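A minimal Java sketch of that lookup, assuming the table above is used as-is. The names `NiceTick`, `STEPS` and `niceTick` are invented for the example, and rejecting non-positive input is my own choice, not something the steps prescribe.

```java
// Hypothetical helper, not part of any library.
final class NiceTick {
    // Mantissa steps copied from the translation table above.
    private static final double[] STEPS =
        {0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9, 1.0};

    /** Rounds a raw tick size up to the next "nice" value. */
    static double niceTick(double rawTick) {
        if (!(rawTick > 0) || Double.isInfinite(rawTick)) {
            throw new IllegalArgumentException("rawTick must be positive and finite");
        }
        // Choose x so that rawTick / 10^x falls in [0.1, 1.0).
        int x = (int) Math.floor(Math.log10(rawTick)) + 1;
        double scale = Math.pow(10, x);
        double mantissa = rawTick / scale;
        for (double step : STEPS) {
            if (mantissa <= step) {
                return step * scale; // first table entry that is large enough
            }
        }
        return scale; // only reachable through floating-point rounding
    }

    public static void main(String[] args) {
        System.out.println(niceTick(21.9));    // the 10-tick example
        System.out.println(niceTick(27.375));  // the 8-tick example
    }
}
```

Because the result is a product of doubles, compare it with a small tolerance rather than with `==` if you test it.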

Let's take a look at the same example with 8 ticks:

15, 234, 140, 65, 90 with 8 ticks
  1. lower bound = 15
  2. upper bound = 234
  3. range = 234-15 = 219
  4. tick range = 27.375
    1. Divide by 10^2 for 0.27375, translates to 0.3, which gives (multiplied by 10^2) 30.
  5. new lower bound = 30 * floor(15/30) = 0
  6. new upper bound = 30 * ceil(234/30) = 240

Which gives the result you requested ;-).

------ Added by KD ------

Here's code that implements this algorithm without using lookup tables:

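double range = ...;
int tickCount = ...;
// tickCount labels span tickCount - 1 segments
double unroundedTickSize = range/(tickCount-1);
// x is the power of ten that puts unroundedTickSize / 10^x in (1, 10]
double x = Math.ceil(Math.log10(unroundedTickSize)-1);
double pow10x = Math.pow(10, x);
// round up to the next whole multiple of 10^x
double roundedTickRange = Math.ceil(unroundedTickSize / pow10x) * pow10x;
return roundedTickRange;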

Generally speaking, the number of ticks includes the bottom tick, so the actual y-axis segments are one less than the number of ticks.
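For anyone who wants to try the snippet directly, here is one way it might be wrapped into a callable method. The class name `TickSize` and the two guards are my additions, not part of KD's code. The guard on `range` covers the "lower bound = upper bound" case mentioned earlier, where `Math.log10(0)` would return negative infinity and the result would be NaN.

```java
// Hypothetical wrapper around the formula above.
final class TickSize {
    static double roundedTickRange(double range, int tickCount) {
        if (tickCount < 2) {
            throw new IllegalArgumentException("need at least 2 ticks");
        }
        if (!(range > 0)) {
            // Degenerate range: the caller has to pick a fallback tick size.
            throw new IllegalArgumentException("range must be positive");
        }
        double unroundedTickSize = range / (tickCount - 1);
        double x = Math.ceil(Math.log10(unroundedTickSize) - 1);
        double pow10x = Math.pow(10, x);
        return Math.ceil(unroundedTickSize / pow10x) * pow10x;
    }

    public static void main(String[] args) {
        System.out.println(roundedTickRange(219, 10));
    }
}
```

Note that this formula only rounds up to whole multiples of 10^x, so for the example data with 10 ticks it works out to 30 rather than the 25 that the lookup table produces. Whether that matters depends on which increments you consider "nice".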