PostgreSQL: ERROR: 42601: a column definition list is required for functions returning "record"

Jeremy Holovacs picture Jeremy Holovacs · Dec 22, 2011 · Viewed 50.8k times · Source

As far as I can tell, my function properly resembles the samples I've seen. Can someone clue me in as to how I get this to work?

create or replace function get_user_by_username(
    username varchar(250),
    online boolean
    ) returns setof record as $$
declare result record;
begin

    if online then 
        update users
        set last_activity = current_timestamp
        where user_name = username;
    end if;

    return query
    select
        user_id,
        user_name,
        last_activity,
        created,
        email,
        approved,
        last_lockout,
        last_login,
        last_password_changed,
        password_question,
        comment
    from
        users
    where
        user_name = username
    limit 1;

    return;
end;
$$ language plpgsql;

Answer

Erwin Brandstetter picture Erwin Brandstetter · Dec 23, 2011

Return selected columns

CREATE OR REPLACE FUNCTION get_user_by_username(_username text
                                              , _online bool DEFAULT false)
  RETURNS TABLE (
    user_id int
  , user_name varchar
  , last_activity timestamptz
  )
  LANGUAGE plpgsql AS
$func$
BEGIN
   IF _online THEN
      RETURN QUERY
      UPDATE users u 
      SET    last_activity = current_timestamp  -- ts with time zone
      WHERE  u.user_name = _username
      RETURNING u.user_id
              , u.user_name
              , u.last_activity;
   ELSE
      RETURN QUERY
      SELECT u.user_id
           , u.user_name
           , u.last_activity
      FROM   users u
      WHERE  u.user_name = _username;
   END IF;
END
$func$;

Call:

SELECT * FROM get_user_by_username('myuser', true);

You had DECLARE result record; but didn't use the variable. I deleted the cruft.

You can return the record directly from the UPDATE, which is much faster than calling an additional SELECT statement. Use RETURN QUERY and UPDATE with a RETURNING clause.

If the user is not _online, default to a plain SELECT. This is also the (safe) default if the second parameter is omitted - which is only possible after providing that default with DEFAULT false in the function definition.

If you don't table-qualify column names (tablename.columnname) in queries inside the function, be wary of naming conflicts between column names and named parameters, which are visible (most) everywhere inside a function.
You can also avoid such conflicts by using positional references ($n) for parameters. Or use a prefix that you never use for column names: like an underscore (_username).

If users.username is defined unique in your table, then LIMIT 1 in the second query is just cruft. If it is not, then the UPDATE can update multiple rows, which is most likely wrong. I assume a unique username and trim the noise.

Define the return type of the function (like @ertx demonstrated) or you have to provide a column definition list with every function call, which is awkward.

Creating a type for that purpose (like @ertx proposed) is a valid approach, but probably overkill for a single function. That was the way to go in old versions of Postgres before we had RETURNS TABLE for that purpose - like demonstrated above.

You do not need a loop for this simple function.

Every function needs a language declaration. LANGUAGE plpgsql in this case.

I use timestamptz (timestamp with time zone) instead of timestamp (timestamp without time zone), which is the sane default. See:

Return (set of) whole row(s)

To return all columns of the existing table users, there is a simpler way. Postgres automatically defines a composite type of the same name for every table. Just use RETURNS SETOF users to vastly simplify the query:

CREATE OR REPLACE FUNCTION get_user_by_username(_username text
                                              , _online bool DEFAULT false)
  RETURNS SETOF users
  LANGUAGE plpgsql AS
$func$
BEGIN
   IF _online THEN
      RETURN QUERY
      UPDATE users u 
      SET    last_activity = current_timestamp
      WHERE  u.user_name = _username
      RETURNING u.*;
   ELSE
      RETURN QUERY
      SELECT *
      FROM   users u
      WHERE  u.user_name = _username;
   END IF;
END
$func$;

Return whole row plus custom addition

To address the question added by TheRealChx101 in a comment below:

What if you also have a calculated value in addition to a whole table? 😑

Not as simple, but doable. We can send the whole row type as one field, and add more:

CREATE OR REPLACE FUNCTION get_user_by_username3(_username text
                                               , _online bool DEFAULT false)
  RETURNS TABLE (
    users_row users
  , custom_addition text
  )
  LANGUAGE plpgsql AS
$func$
BEGIN
   IF _online THEN
      RETURN QUERY
      UPDATE users u 
      SET    last_activity = current_timestamp  -- ts with time zone
      WHERE  u.user_name = _username
      RETURNING u  -- whole row
              , u.user_name || u.user_id;
   ELSE
      RETURN QUERY
      SELECT u, u.user_name || u.user_id
      FROM   users u
      WHERE  u.user_name = _username;
   END IF;
END
$func$;

The "magic" is in the function call, where we (optionally) decompose the row type:

SELECT (users_row).*, custom_addition FROM get_user_by_username('foo', true);

db<>fiddle here (showing all)


If you need something more "dynamic", consider: