Histogram in JavaScript?

dani · Nov 19, 2011

I have this dataset for income:

Income      Number of people
0           245981
8.8         150444
30          126063
49.9        123519
70          115029
90.7        277149
109.1       355768
130         324246
150.3       353239
170.2       396008
190         396725
210         398640
230.1       401932
250         416079
270         412727
289.8       385192
309.7       343178
329.7       293707
349.6       239982
369.7       201557
389.3       165132
442.3       442075
543.4       196526
679.9       146784
883.9       48600
1555        44644

(As you can see, the width between income levels gets larger towards the end.)

  1. How do I make an accurate histogram of this data in JavaScript? (On a linear x-axis with a range of, for example, 0 to 2000.)
  2. How do I factor out the number of people to show only percentages at different intervals?
  3. If I'd like to place exactly 100 symbols representing the data, how do I decide where to place them?

Answer

mbostock · Jan 17, 2012

The existing histogram examples compute the histogram from samples, for example from a list of individual people and their incomes. In this case, you already have the data for the histogram; you just want to display it.

The tricky thing here is that your histogram has variable-width bins. The first thing you can do is ignore the variable width of each bin and just display a simple lollipop chart. The x-axis is a linear scale for income, and the y-axis is a linear scale for the count of people:

http://bl.ocks.org/1624656

If you want to convert this to a histogram, you can't just replace those fixed-width lines with variable-width bars; you need to normalize the data so that the area of each bar encodes the number of people with that income. Therefore, the width of the bar is the income range (such as from 0 to 8.8 for the first bin), and the height of the bar is the number of people divided by that width. As a result, the area (width × height) is proportional to the number of people. That looks like this:

http://bl.ocks.org/1624660
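
For reference, here is a minimal sketch of that normalization in plain JavaScript, independent of any charting library. It assumes each income value in the table is the lower edge of its bin and the next row's value is the upper edge, which matches the "0 to 8.8" reading above. The last row has no upper edge in the data, so `LAST_EDGE = 2000` is purely an assumption borrowed from the x-axis range in the question. The names `rows`, `toBars` and `share` are made up for this example; `share` (count divided by the total) is one possible way to show percentages instead of raw counts.

```javascript
// Rows from the question: [income at the lower edge of the bin, number of people].
const rows = [
  [0, 245981], [8.8, 150444], [30, 126063], [49.9, 123519],
  [70, 115029], [90.7, 277149], [109.1, 355768], [130, 324246],
  [150.3, 353239], [170.2, 396008], [190, 396725], [210, 398640],
  [230.1, 401932], [250, 416079], [270, 412727], [289.8, 385192],
  [309.7, 343178], [329.7, 293707], [349.6, 239982], [369.7, 201557],
  [389.3, 165132], [442.3, 442075], [543.4, 196526], [679.9, 146784],
  [883.9, 48600], [1555, 44644]
];

// ASSUMPTION: the data does not say where the last bin ends.
// 2000 is taken from the x-axis range mentioned in the question.
const LAST_EDGE = 2000;

// Turn each row into a bar whose AREA (width * height) equals the count.
function toBars(rows, lastEdge) {
  const total = rows.reduce((sum, [, count]) => sum + count, 0);
  return rows.map(([x0, count], i) => {
    // Upper edge is the next row's income, or the assumed edge for the last row.
    const x1 = i + 1 < rows.length ? rows[i + 1][0] : lastEdge;
    const width = x1 - x0;
    return {
      x0,
      x1,
      width,
      count,
      height: count / width, // people per unit of income
      share: count / total   // fraction of all people in this bin
    };
  });
}

const bars = toBars(rows, LAST_EDGE);
```

Each element of `bars` can then be drawn as a rectangle from `x0` to `x1` with the given `height`, using whatever scales your chart already has. Changing `LAST_EDGE` only changes the width and height of the final bar, not its area.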