Transform table to one-hot-encoding of single column value

Ernest picture Ernest · Aug 10, 2017 · Viewed 7.8k times · Source

I have a table with two columns:

+---------+--------+
| keyword | color  |
+---------+--------+
| foo     | red    |
| bar     | yellow |
| fobar   | red    |
| baz     | blue   |
| bazbaz  | green  |
+---------+--------+

I need to do some kind of one-hot encoding and transform table in PostgreSQL to:

+---------+-----+--------+-------+------+
| keyword | red | yellow | green | blue |
+---------+-----+--------+-------+------+
| foo     |   1 |      0 |     0 |    0 |
| bar     |   0 |      1 |     0 |    0 |
| fobar   |   1 |      0 |     0 |    0 |
| baz     |   0 |      0 |     0 |    1 |
| bazbaz  |   0 |      0 |     1 |    0 |
+---------+-----+--------+-------+------+

Is it possible to do with SQL only? Any tips on how to get started?

Answer

Oto Shavadze picture Oto Shavadze · Aug 10, 2017

If I correctly understand, you need conditional aggregation:

select keyword,
count(case when color = 'red' then 1 end) as red,
count(case when color = 'yellow' then 1 end) as yellow
-- another colors here
from t
group by keyword