Insert multiple rows in one table based on number in another table

jO. · Oct 2, 2013

I am creating a database for the first time using Postgres 9.3 on MacOSX.

Let's say I have tables A and B. A starts off empty and B is filled. I would like table A to end up with one row for each unique entry in column all_names of table B, with number holding how many times that entry occurs, as in the example tables below. I am not used to the syntax yet, so I do not really know how to go about it. The birthday column is not needed for this.

Table A

names | number
------+--------
Carl  | 3
Bill  | 4
Jen   | 2

Table B

 all_names | birthday
-----------+------------
Carl       | 17/03/1980
Carl       | 22/08/1994
Carl       | 04/09/1951
Bill       | 02/12/2003
Bill       | 11/03/1975
Bill       | 04/06/1986
Bill       | 08/07/2005
Jen        | 05/03/2009
Jen        | 01/04/1945

Would this be the correct way to go about it?

insert into a (names, number)
select b.all_names, count(b.all_names)
from b
group by b.all_names;

Answer

Erwin Brandstetter · Oct 2, 2013

Answer to original question

Postgres allows set-returning functions (SRF) to multiply rows. generate_series() is your friend:

INSERT INTO b (all_names, birthday)
SELECT names, current_date -- AS birthday ??
FROM  (SELECT names, generate_series(1, number) FROM a) AS sub;  -- a subquery in FROM needs an alias

Since the introduction of LATERAL in Postgres 9.3 you can stick to standard SQL: the SRF moves from the SELECT list to the FROM list:

INSERT INTO b (all_names, birthday)
SELECT a.names, current_date -- AS birthday ??
FROM   a, generate_series(1, a.number) AS rn;

LATERAL is implicit here, as explained in the manual:

LATERAL can also precede a function-call FROM item, but in this case it is a noise word, because the function expression can refer to earlier FROM items in any case.
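To try this out in isolation, here is a minimal sketch that spells the LATERAL keyword out explicitly. It assumes scratch temp tables named a and b with the column layout from the question (the column types are my guess, not something the question states) and uses current_date as a placeholder birthday:

```sql
-- Hypothetical scratch tables mirroring the question's layout; types are assumed.
CREATE TEMP TABLE a (names text, number int);
CREATE TEMP TABLE b (all_names text, birthday date);

INSERT INTO a (names, number)
VALUES ('Carl', 3), ('Bill', 4), ('Jen', 2);

-- Explicit LATERAL: generate_series() is evaluated once per row of a
-- and returns a.number rows, so each name is repeated that many times.
INSERT INTO b (all_names, birthday)
SELECT a.names, current_date            -- placeholder birthday
FROM   a
CROSS  JOIN LATERAL generate_series(1, a.number) AS rn;

-- Check: the row count per name in b should match a.number.
SELECT all_names, count(*) AS ct
FROM   b
GROUP  BY all_names
ORDER  BY all_names;
-- Bill | 4
-- Carl | 3
-- Jen  | 2
```

Dropping the word LATERAL from the join gives the same result, which is the point of the quoted passage.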

Reverse operation

The above is the reverse operation (approximately) of a simple aggregate count():

INSERT INTO a (names, number)
SELECT all_names, count(*)
FROM   b
GROUP  BY 1;

... which fits your updated question.

Note a subtle difference between count(*) and count(all_names). The former counts all rows, no matter what, while the latter only counts rows where all_names IS NOT NULL. If your column all_names is defined as NOT NULL, both return the same, but count(*) is a bit shorter and faster.
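A small sketch of that difference, using an inline VALUES list with one NULL instead of the real table b (the sample rows are made up for illustration):

```sql
-- Three rows, one of which has a NULL name.
SELECT count(*)         AS all_rows,        -- counts every row
       count(all_names) AS non_null_names   -- skips the NULL
FROM  (VALUES ('Carl'), ('Carl'), (NULL)) AS t(all_names);
-- all_rows = 3, non_null_names = 2
```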

About GROUP BY 1: