How to do Erlang pattern matching using regular expressions?

Bruno Rijsman · Nov 2, 2009

When I write Erlang programs which do text parsing, I frequently run into situations where I would love to do a pattern match using a regular expression.

For example, I wish I could do something like this, where ~ is a "made up" regular expression matching operator:

my_function(String ~ ["^[A-Za-z]+[A-Za-z0-9]*$"]) ->
    ....

I know about the regular expression module (re), but AFAIK you cannot call functions in a pattern, and guards allow only a restricted set of built-in functions, which does not include re:run.

Also, I wish matching strings could be done in a case-insensitive way. This is handy, for example, when parsing HTTP headers. I would love to do something like this, where "Str ~ {Pattern, Options}" means "match Str against pattern Pattern using options Options":

handle_accept_language_header(Header ~ {"Accept-Language", [case_insensitive]}) ->
    ...

Two questions:

  1. How do you typically handle this using just standard Erlang? Is there some mechanism or coding style that comes close to this in terms of conciseness and readability?

  2. Is there any work (an EEP?) going on in Erlang to address this?

Answer

Rob Charlton · Nov 2, 2009

You really don't have much choice other than to run your regexps in advance and then pattern match on the results. Here's a very simple example that approaches what I think you're after, but it does suffer from the flaw that you need to write each regexp twice: once in the list passed to multire/2 and once in a test2/2 clause. You could make this less painful by using a macro to define each regexp in one place.

-module(multire).

-compile(export_all).

%% Return the first regexp in the list that matches String,
%% or nomatch if none of them do.
multire([],_) ->
    nomatch;
multire([RE|RegExps],String) ->
    case re:run(String,RE,[{capture,none}]) of
        match ->
            RE;
        nomatch ->
            multire(RegExps,String)
    end.


%% Find the first matching regexp, then dispatch on it in test2/2.
test(Foo) ->
    test2(multire(["^Hello","world$","^....$"],Foo),Foo).

test2("^Hello",Foo) ->
    io:format("~p matched the hello pattern~n",[Foo]);
test2("world$",Foo) ->
    io:format("~p matched the world pattern~n",[Foo]);
test2("^....$",Foo) ->
    io:format("~p matched the four chars pattern~n",[Foo]);
test2(nomatch,Foo) ->
    io:format("~p failed to match~n",[Foo]).
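
The macro idea mentioned above could look roughly like the sketch below. It also passes the caseless option to re:run/3, which covers the case-insensitive header matching from the question. The module name multire_macros, the macro names, the function names, and the returned atoms are all made up for illustration; only re:run/3 and its capture and caseless options come from the standard library.

```erlang
-module(multire_macros).
-export([classify/1, first_match/2]).

%% Each regexp is written exactly once, as a macro. The macros expand to
%% string literals, so they can be used both as values and as patterns.
%% (Hypothetical names, chosen for this sketch.)
-define(RE_ACCEPT_LANGUAGE, "^Accept-Language:").
-define(RE_HELLO, "^Hello").
-define(RE_WORLD, "world$").

%% Return the first regexp in the list that matches String, or nomatch.
%% The caseless option makes every match case-insensitive.
first_match([], _String) ->
    nomatch;
first_match([RE | Rest], String) ->
    case re:run(String, RE, [{capture, none}, caseless]) of
        match ->
            RE;
        nomatch ->
            first_match(Rest, String)
    end.

%% Run the regexps first, then pattern match on which one succeeded.
classify(String) ->
    REs = [?RE_ACCEPT_LANGUAGE, ?RE_HELLO, ?RE_WORLD],
    case first_match(REs, String) of
        ?RE_ACCEPT_LANGUAGE -> accept_language;
        ?RE_HELLO -> hello;
        ?RE_WORLD -> world;
        nomatch -> nomatch
    end.
```

With this in place, multire_macros:classify("accept-language: en-US") returns accept_language and multire_macros:classify("xyz") returns nomatch. The regexps are still tried one at a time at run time, so this only removes the duplication in the source; it does not make regexps part of the pattern-matching syntax.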