I am trying to get 25 random samples of 15,000 IDs from a table. Instead of manually pressing run every time, I'm trying to do a loop. Which I fully understand is not the optimum use of Postgres, but it is the tool I have. This is what I have so far:
for i in 1..25 LOOP
insert into playtime.meta_random_sample
select i, ID
from tbl
order by random() limit 15000
end loop
Procedural elements like loops are not part of the SQL language and can only be used inside the body of a procedural language function, procedure (Postgres 11 or later) or a DO
statement, where such additional elements are defined by the respective procedural language. The default is PL/pgSQL, but there are others.
Example with plpgsql:
DO
$do$
BEGIN
FOR i IN 1..25 LOOP
INSERT INTO playtime.meta_random_sample
(col_i, col_id) -- declare target columns!
SELECT i, id
FROM tbl
ORDER BY random()
LIMIT 15000;
END LOOP;
END
$do$;
For many tasks that can be solved with a loop, there is a shorter and faster set-based solution around the corner. Pure SQL equivalent for your example:
INSERT INTO playtime.meta_random_sample (col_i, col_id)
SELECT t.*
FROM generate_series(1,25) i
CROSS JOIN LATERAL (
SELECT i, id
FROM tbl
ORDER BY random()
LIMIT 15000
) t;
About generate_series()
:
About optimizing performance of random selections: