How to prevent 'invalid input syntax for type json' in Postgres, when records contain a mix of json or strings

Allyl Isocyanate picture Allyl Isocyanate · Aug 31, 2017 · Viewed 40.2k times · Source

I have a text column that contains JSON and also plan text. I want to convert it to JSON, and then select a particular property. For example:

user_data
_________
{"user": {"name": "jim"}}
{"user": {"name": "sally"}}
some random data string

I've tried:

select user_data::json#>'{user,name}' from users

I get:

ERROR:  invalid input syntax for type json
DETAIL:  Token "some" is invalid.
CONTEXT:  JSON user_data, line 1: some...

Is it possible to prevent this?

Answer

Andy Carlson picture Andy Carlson · Aug 31, 2017

If you want to skip the rows with invalid JSON, you must first test if the text is valid JSON. You can do this by creating a function which will attempt to parse the value, and catch the exception for invalid JSON values.

CREATE OR REPLACE FUNCTION is_json(input_text varchar) RETURNS boolean AS $$
  DECLARE
    maybe_json json;
  BEGIN
    BEGIN
      maybe_json := input_text;
    EXCEPTION WHEN others THEN
      RETURN FALSE;
    END;

    RETURN TRUE;
  END;
$$ LANGUAGE plpgsql IMMUTABLE;

When you have that, you could use the is_json function in a CASE or WHERE clause to narrow down the valid values.

-- this can eliminate invalid values
SELECT user_data::json #> '{user,name}'
FROM users WHERE is_json(user_data);

-- or this if you want to fill will NULLs
SELECT
  CASE
    WHEN is_json(user_data)
      THEN user_data::json #> '{user,name}'
    ELSE
      NULL
  END
FROM users;