Test for null in function with varying parameters

milovanderlinden picture milovanderlinden · Jun 27, 2013 · Viewed 21.2k times · Source

I have a Postgres function:

create function myfunction(integer, text, text, text, text, text, text) RETURNS 
table(id int, match text, score int, nr int, nr_extra character varying, info character varying, postcode character varying,
street character varying, place character varying, country character varying, the_geom geometry)
AS $$
BEGIN

return query (select a.id, 'address' as match, 1 as score, a.ad_nr, a.ad_nr_extra,a.ad_info,a.ad_postcode, s.name as street, p.name place , c.name country, a.wkb_geometry as wkb_geometry from "Addresses" a 
    left join "Streets" s on a.street_id = s.id 
        left join "Places" p on s.place_id = p.id 
            left join "Countries" c on p.country_id = c.id 
            where c.name = $7 
                and p.name = $6
                    and s.name = $5
                    and a.ad_nr = $1 
                    and a.ad_nr_extra = $2
                    and a.ad_info = $3
                    and ad_postcode = $4);
END;
$$
LANGUAGE plpgsql;

This function fails to give the right result when one or more of the variables entered are NULL because ad_postcode = NULL will fail.

What can I do to test for NULL inside the query?

Answer

Erwin Brandstetter picture Erwin Brandstetter · Jun 28, 2013

I disagree with some of the advice in other answers. This can be done with PL/pgSQL and I think it is mostly far superior to assembling queries in a client application. It is faster and cleaner and the app only sends the bare minimum across the wire in requests. SQL statements are saved inside the database, which makes it easier to maintain - unless you want to collect all business logic in the client application, this depends on the general architecture.

PL/pgSQL function with dynamic SQL

CREATE OR REPLACE FUNCTION func(
      _ad_nr       int  = NULL
    , _ad_nr_extra text = NULL
    , _ad_info     text = NULL
    , _ad_postcode text = NULL
    , _sname       text = NULL
    , _pname       text = NULL
    , _cname       text = NULL)
  RETURNS TABLE(id int, match text, score int, nr int, nr_extra text
              , info text, postcode text, street text, place text
              , country text, the_geom geometry)
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- RAISE NOTICE '%', -- for debugging
   RETURN QUERY EXECUTE concat(
   $$SELECT a.id, 'address'::text, 1 AS score, a.ad_nr, a.ad_nr_extra
        , a.ad_info, a.ad_postcode$$

   , CASE WHEN (_sname, _pname, _cname) IS NULL THEN ', NULL::text' ELSE ', s.name' END  -- street
   , CASE WHEN (_pname, _cname) IS NULL         THEN ', NULL::text' ELSE ', p.name' END  -- place
   , CASE WHEN _cname IS NULL                   THEN ', NULL::text' ELSE ', c.name' END  -- country
   , ', a.wkb_geometry'

   , concat_ws('
   JOIN   '
   , '
   FROM   "Addresses" a'
   , CASE WHEN NOT (_sname, _pname, _cname) IS NULL THEN '"Streets"   s ON s.id = a.street_id' END
   , CASE WHEN NOT (_pname, _cname) IS NULL         THEN '"Places"    p ON p.id = s.place_id' END
   , CASE WHEN _cname IS NOT NULL                   THEN '"Countries" c ON c.id = p.country_id' END
   )

   , concat_ws('
   AND    '
      , '
   WHERE  TRUE'
      , CASE WHEN $1 IS NOT NULL THEN 'a.ad_nr = $1' END
      , CASE WHEN $2 IS NOT NULL THEN 'a.ad_nr_extra = $2' END
      , CASE WHEN $3 IS NOT NULL THEN 'a.ad_info = $3' END
      , CASE WHEN $4 IS NOT NULL THEN 'a.ad_postcode = $4' END
      , CASE WHEN $5 IS NOT NULL THEN 's.name = $5' END
      , CASE WHEN $6 IS NOT NULL THEN 'p.name = $6' END
      , CASE WHEN $7 IS NOT NULL THEN 'c.name = $7' END
   )
   )
   USING $1, $2, $3, $4, $5, $6, $7;
END
$func$;

Call:

SELECT * FROM func(1, '_ad_nr_extra', '_ad_info', '_ad_postcode', '_sname');

SELECT * FROM func(1, _pname := 'foo');

Since all function parameters have default values, you can use positional notation, named notation or mixed notation at your choosing in the function call. See:

More explanation for basics of dynamic SQL:

The concat() function is instrumental for building the string. It was introduced with Postgres 9.1.

The ELSE branch of a CASE statement defaults to NULL when not present. Simplifies the code.

The USING clause for EXECUTE makes SQL injection impossible as values are passed as values and allows to use parameter values directly, exactly like in prepared statements.

NULL values are used to ignore parameters here. They are not actually used to search.

You don't need parentheses around the SELECT with RETURN QUERY.

Simple SQL function

You could do it with a plain SQL function and avoid dynamic SQL. For some cases this may be faster. But I wouldn't expect it in this case. Planning the query without unnecessary joins and predicates typically produces best results. Planning cost for a simple query like this is almost negligible.

CREATE OR REPLACE FUNCTION func_sql(
     _ad_nr       int  = NULL
   , _ad_nr_extra text = NULL
   , _ad_info     text = NULL
   , _ad_postcode text = NULL
   , _sname       text = NULL
   , _pname       text = NULL
   , _cname       text = NULL)
  RETURNS TABLE(id int, match text, score int, nr int, nr_extra text
              , info text, postcode text, street text, place text
              , country text, the_geom geometry)
  LANGUAGE sql AS 
$func$
SELECT a.id, 'address' AS match, 1 AS score, a.ad_nr, a.ad_nr_extra
     , a.ad_info, a.ad_postcode
     , s.name AS street, p.name AS place
     , c.name AS country, a.wkb_geometry
FROM   "Addresses"      a
LEFT   JOIN "Streets"   s ON s.id = a.street_id
LEFT   JOIN "Places"    p ON p.id = s.place_id
LEFT   JOIN "Countries" c ON c.id = p.country_id
WHERE ($1 IS NULL OR a.ad_nr = $1)
AND   ($2 IS NULL OR a.ad_nr_extra = $2)
AND   ($3 IS NULL OR a.ad_info = $3)
AND   ($4 IS NULL OR a.ad_postcode = $4)
AND   ($5 IS NULL OR s.name = $5)
AND   ($6 IS NULL OR p.name = $6)
AND   ($7 IS NULL OR c.name = $7)
$func$;

Identical call.

To effectively ignore parameters with NULL values:

($1 IS NULL OR a.ad_nr = $1)

To actually use NULL values as parameters, use this construct instead:

($1 IS NULL AND a.ad_nr IS NULL OR a.ad_nr = $1)  -- AND binds before OR

This also allows for indexes to be used.
For the case at hand, replace all instances of LEFT JOIN with JOIN.

db<>fiddle here - with simple demo for all variants.
Old sqlfiddle

Asides

  • Don't use name and id as column names. They are not descriptive and when you join a bunch of tables (like you do to a lot in a relational database), you end up with several columns all named name or id, and have to attach aliases to sort the mess.

  • Please format your SQL properly, at least when asking public questions. But do it privately as well, for your own good.