PostgreSQL - Writing dynamic sql in stored procedure that returns a result set

prince picture prince · Aug 14, 2012 · Viewed 47.3k times · Source

How can I write a stored procedure that contains a dynamically built SQL statement that returns a result set? Here is my sample code:

CREATE OR REPLACE FUNCTION reporting.report_get_countries_new (
  starts_with varchar,
  ends_with varchar
)
RETURNS TABLE (
  country_id integer,
  country_name varchar
) AS
$body$
DECLARE
  starts_with ALIAS FOR $1;
  ends_with ALIAS FOR $2;
  sql VARCHAR;
BEGIN

    sql = 'SELECT * FROM lookups.countries WHERE lookups.countries.country_name >= ' || starts_with ;

    IF ends_with IS NOT NULL THEN
        sql = sql || ' AND lookups.countries.country_name <= ' || ends_with ;
    END IF;

    RETURN QUERY EXECUTE sql;

END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100 ROWS 1000;

This code returns an error:

ERROR:  syntax error at or near "RETURN"
LINE 1: RETURN QUERY SELECT * FROM omnipay_lookups.countries WHERE o...
        ^
QUERY:  RETURN QUERY SELECT * FROM omnipay_lookups.countries WHERE omnipay_lookups.countries.country_name >= r
CONTEXT:  PL/pgSQL function "report_get_countries_new" line 14 at EXECUTE statement

I have tried other ways instead of this:

RETURN QUERY EXECUTE sql;

Way 1:

RETURN EXECUTE sql;

Way 2:

sql = 'RETURN QUERY SELECT * FROM....
/*later*/
EXECUTE sql;

In all cases without success.

Ultimately I want to write a stored procedure that contains a dynamic sql statement and that returns the result set from the dynamic sql statement.

Answer

Erwin Brandstetter picture Erwin Brandstetter · Aug 21, 2012

There is room for improvements:

CREATE OR REPLACE FUNCTION report_get_countries_new (starts_with text
                                                   , ends_with   text = NULL)
  RETURNS SETOF lookups.countries AS
$func$
DECLARE
   sql text := 'SELECT * FROM lookups.countries WHERE country_name >= $1';
BEGIN
   IF ends_with IS NOT NULL THEN
      sql := sql || ' AND country_name <= $2';
   END IF;

   RETURN QUERY EXECUTE sql
   USING starts_with, ends_with;
END
$func$ LANGUAGE plpgsql;
-- the rest is default settings

Major points

  • PostgreSQL 8.4 introduced the USING clause for EXECUTE, which is useful for several reasons. Recap in the manual:

    The command string can use parameter values, which are referenced in the command as $1, $2, etc. These symbols refer to values supplied in the USING clause. This method is often preferable to inserting data values into the command string as text: it avoids run-time overhead of converting the values to text and back, and it is much less prone to SQL-injection attacks since there is no need for quoting or escaping.

    IOW, it is safer and faster than building a query string with text representation of parameters, even when sanitized with quote_literal().
    Note that $1, $2 in the query string refer to the supplied values in the USING clause, not to the function parameters.

  • While you return SELECT * FROM lookups.countries, you can simplify the RETURN declaration like demonstrated:

    RETURNS SETOF lookups.countries
    

    In PostgreSQL there is a composite type defined for every table automatically. Use it. The effect is that the function depends on the type and you get an error message if you try to alter the table. Drop & recreate the function in such a case.

    This may or may not be desirable - generally it is! You want to be made aware of side effects if you alter tables. The way you have it, your function would break silently and raise an exception on it's next call.

  • If you provide an explicit default for the second parameter in the declaration like demonstrated, you can (but don't have to) simplify the call in case you don't want to set an upper bound with ends_with.

    SELECT * FROM report_get_countries_new('Zaire');
    

    instead of:

    SELECT * FROM report_get_countries_new('Zaire', NULL);
    

    Be aware of function overloading in this context.

  • Don't quote the language name 'plpgsql' even if that's tolerated (for now). It's an identifier.

  • You can assign a variable at declaration time. Saves an extra step.

  • Parameters are named in the header. Drop the nonsensical lines:

     starts_with ALIAS FOR $1;
     ends_with ALIAS FOR $2;