Convert "regexp_substr" (Oracle) to PostgreSQL

Catalin Vladu · Aug 18, 2015

I have this query in Oracle SQL:

       select town_name, 
              regexp_substr(town_name, '[^A,]+', 1, 1) as c1,
              regexp_substr(town_name, '[^A,]+', 1, 2) as c2, 
              regexp_substr(town_name, '[^A,]+', 1, rownum) as c_rownum,
              rownum
          from epr_towns

The first 2 rows from the result are:

town_name       c1  c2  c_rownum  rownum
VALGANNA        V   LG  V         1
VARANO BORGHI   V   R   R         2

I need to obtain the same result in PostgreSQL (in particular for the column regexp_substr(town_name, '[^A,]+', 1, rownum) as c_rownum), and I don't know how. Could you help me? Thanks.

Answer

wrschneider · Jul 5, 2018

There are really two separate problems here:

  • replacing rownum
  • replacing regexp_substr with regexp_matches

To solve for rownum, use a CTE (WITH clause) to add a rownum-like column to your underlying table.

regexp_matches works a little differently from Oracle's regexp_substr. While regexp_substr takes the match number n as an argument, PostgreSQL's regexp_matches, when called with the 'g' flag, returns all matches as a set of rows. So you have to wrap the call in a subquery with limit/offset to pluck out the nth match. Also, the rows returned by regexp_matches are arrays, so assuming you have no parenthesized expressions in your regexp, you need to index the first item in each array.
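To make the set-of-arrays behaviour concrete, here is a minimal sketch that runs regexp_matches against a string literal instead of the epr_towns table. The literal 'VALGANNA' is taken from the question's sample output; the column aliases m and second_match are only illustrative.

```sql
-- With the 'g' flag, every match comes back as its own row,
-- and each row is a one-element text[] (no capture groups in the pattern).
select regexp_matches('VALGANNA', '[^A,]+', 'g') as m;
-- rows: {V}, {LG}, {NN}

-- Pick the 2nd match: take element [1] of each array,
-- skip one row, keep one row.
select (regexp_matches('VALGANNA', '[^A,]+', 'g'))[1] as second_match
offset 1 limit 1;
-- row: LG
```

The same offset/limit trick is what stands in for the fourth argument of Oracle's regexp_substr.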

The end result looks like this:

http://sqlfiddle.com/#!17/865ee/7

    -- number the rows, standing in for Oracle's rownum
    with epr_towns_rn as (
        select town_name,
               row_number() over (order by town_name) as rn
        from epr_towns
    )
    select town_name,
           (select (regexp_matches(town_name, '[^A,]+', 'g'))[1] offset 0 limit 1) as c1,
           (select (regexp_matches(town_name, '[^A,]+', 'g'))[1] offset 1 limit 1) as c2,
           -- nth match: skip rn-1 matches, then take one
           (select (regexp_matches(town_name, '[^A,]+', 'g'))[1] offset rn-1 limit 1)
             as c_rownum,
           rn
    from epr_towns_rn;

If you only wanted the first match, you could leave out the 'g' argument and drop the limit/offset from the subquery. You would still need the subquery wrapper, though: when there is no match, regexp_matches returns no rows, and the scalar subquery turns that into null, which mimics regexp_substr returning null when nothing matches.
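A minimal sketch of that first-match variant follows. It assumes a stand-in VALUES list rather than the real epr_towns table, and the second value, 'AAA', is a made-up input chosen so that the pattern finds nothing.

```sql
-- No 'g' flag: regexp_matches yields at most one row per input.
-- The scalar subquery turns "no rows" into NULL instead of dropping the row.
select town_name,
       (select (regexp_matches(town_name, '[^A,]+'))[1]) as c1
from (values ('VALGANNA'), ('AAA')) as t(town_name);
-- VALGANNA -> V
-- AAA      -> NULL
```

Without the subquery wrapper, the 'AAA' row would disappear from the result, because a set-returning function in the select list that yields zero rows removes the row entirely.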