Oracle SQL: Return first line of string using regexp_substr

bawpie · Jul 15, 2013 · Viewed 16.4k times

I am trying to return the first line of text from a text box in an SQL query (Oracle 11). The content of the text box looks like this:

   X WITHDRAWN

   Explanation.

I want to return the top line, i.e. the X WITHDRAWN. I'm not sure if I can specify to look at the first line only, or to just return all text before a carriage return - either would work.

I think I need to use regexp_substr but I'm not quite sure on the syntax. I have tried:

   regexp_substr(TABLE.TEXT,'^.*$')

but it didn't work, so any assistance would be much appreciated!

EDIT: The solution used:

   select regexp_substr(TABLE.TEXT, '[^,]+['||CHR(10)||']') from tab

EDIT: I noticed I was getting a mixture of line feeds and carriage returns in my results, so I've used the following solution to return just the text and no additional characters.

    select 
     replace(replace(regexp_substr(TABLE.TEXT, '[^,]+['||CHR(10)||']'),CHR(10),''),CHR(13),'') 
     from tab 

EDIT: Following @Ben's answer, I've amended my solution to the following:

    select
     initcap(replace(regexp_substr(TABLE.TEXT, '.*$', 1, 1, 'm'),CHR(13),''))
     from tab

Answer

Ben · Jul 15, 2013

Parado's regular expression matches one or more characters that are not a comma, followed by a line feed (CHR(10)). This means it won't work if the lines end in a carriage return (CHR(13)) alone, or if there's a comma in the first line.
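A minimal sketch of how that pattern can go wrong, using two made-up strings (the sample values and the column alias are assumptions for illustration, not data from the question). Because [^,] also matches line breaks and + is greedy, the first row should come back spanning two lines rather than one, and the second row should start after the comma:

```sql
-- Illustrative only: the sample strings below are invented for this sketch.
with samples as (
  -- three lines, no commas
  select 'X WITHDRAWN' || chr(10) || 'Explanation.' || chr(10) || 'More.' as str
    from dual
   union all
  -- a comma inside the first line
  select 'a,b' || chr(10) || 'c'
    from dual
)
select str,
       -- one or more non-comma characters, then a line feed (CHR(10))
       regexp_substr(str, '[^,]+[' || chr(10) || ']') as first_line_attempt
  from samples
```

In both cases the result also keeps its trailing line feed, which is why the question's second edit wraps the call in replace.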

Oracle supports multi-line matching through the m match parameter. In this mode, $ matches the end of each line as well as the end of the string. You can use this to simplify the expression massively to:

regexp_substr(str, '.*$', 1, 1, 'm')

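That is, counting from the first character (position 1), return the first occurrence (occurrence 1) of any run of characters followed by the end of a line. Because . does not match a line feed unless the n match parameter is given, that first occurrence is the first line.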

As an example:

with strings as ( 
 select 'hi
         hi again' as str
   from dual
  union all
 select 'bye
         and again'
   from dual
        )
 select regexp_substr(str, '.*$', 1, 1, 'm')
   from strings
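
If the text was entered with Windows-style line endings (CHR(13)||CHR(10)), the match may keep a trailing carriage return, since as far as I can tell from Oracle's documentation . excludes only the newline character by default. The question's final edit strips it with replace. The sketch below does the same with rtrim instead, which only touches the end of the match; the sample string is invented for illustration:

```sql
-- Illustrative only: a made-up string with CR+LF line endings.
with strings as (
  select 'X WITHDRAWN' || chr(13) || chr(10)
         || chr(13) || chr(10)
         || 'Explanation.' as str
    from dual
)
select -- take the first line, then drop any carriage return left at its end
       rtrim(regexp_substr(str, '.*$', 1, 1, 'm'), chr(13)) as first_line
  from strings
```

Under those assumptions this should return X WITHDRAWN with no trailing control characters.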