Cloudant Query & CouchDB Mango: How to set $regex flags?

Colin Skow · May 11, 2016

Is it possible to set $regex flags using Cloudant Query / CouchDB 2.0 Find?

Specifically I want a case insensitive search and global would also be useful.

In JavaScript I would do:

db.find({
  selector: {
    _id: {$gt: null},
    series: {$regex: /mario/i}
  }
});

But I have no clue how to code that into an Erlang string.

Answer

Colin Skow · May 14, 2016

From Cloudant Support:

I understand that you wish to do a case-insensitive match using the $regex operator in Cloudant Query.

As an example, this Cloudant Query selector returns all documents whose "series" field is a string containing a case-insensitive match for "mario":

{
  "selector": {
    "_id": {
      "$gt": null
    },
    "series": {
      "$regex": "(?i)mario"
    }
  }
}
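
If the search term comes from user input, the selector above can be built programmatically. This is only a minimal sketch: escapeRegex and caseInsensitiveSelector are hypothetical helper names, not part of Cloudant or any client library, and the escaping assumes the term should be matched literally.

```javascript
// Hypothetical helper: escape regex metacharacters so that a
// user-supplied term is matched as literal text.
function escapeRegex(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a selector like the one above, with the
// (?i) inline option prepended to make the match case-insensitive.
function caseInsensitiveSelector(field, term) {
  return {
    selector: {
      _id: { $gt: null },
      [field]: { $regex: "(?i)" + escapeRegex(term) }
    }
  };
}

// Prints the same JSON document as the selector shown above.
console.log(JSON.stringify(caseInsensitiveSelector("series", "mario"), null, 2));
```

The resulting object can be passed to db.find() as in the question, or serialized to JSON and saved as the query file.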

Using that selector in a file called query.txt, and with appropriate values set for $ACCOUNTNAME, $DATABASE, $USERNAME and $PASSWORD, you can run this query to get the correct result:

curl -X POST "https://$ACCOUNTNAME.cloudant.com/$DATABASE/_find" \
  -H "Content-Type: application/json" \
  -d @query.txt -u "$USERNAME:$PASSWORD"

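For anyone calling the endpoint from JavaScript rather than curl, the same request can be sketched roughly as follows. buildFindRequest is a hypothetical helper name, all of its parameters are placeholders, https is assumed, and Buffer means this is written for Node.js rather than the browser.

```javascript
// Hypothetical helper: assemble the URL and request options that
// correspond to the curl command (POST to /_find with basic auth).
function buildFindRequest(account, database, username, password, query) {
  const credentials = Buffer.from(username + ":" + password).toString("base64");
  return {
    url: "https://" + account + ".cloudant.com/" +
         encodeURIComponent(database) + "/_find",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": "Basic " + credentials
      },
      body: JSON.stringify(query)
    }
  };
}

// Usage sketch (needs a fetch implementation, e.g. Node 18+, and real values):
// const { url, options } = buildFindRequest(account, database, user, pass, {
//   selector: { _id: { $gt: null }, series: { $regex: "(?i)mario" } }
// });
// fetch(url, options).then(res => res.json()).then(body => console.log(body.docs));
```
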
The Cloudant API Reference at https://docs.cloudant.com/cloudant_query.html#creating-selector-expressions says of the $regex operator in Cloudant Query selectors:

Most selector expressions work exactly as you would expect for the given operator. The matching algorithms used by the $regex operator are currently based on the Perl Compatible Regular Expression (PCRE) library. However, not all of the PCRE library is implemented, and some parts of the $regex operator go beyond what PCRE offers. For more information about what is implemented, see the Erlang Regular Expression information http://erlang.org/doc/man/re.html.

And the Erlang Regular Expression documentation it refers to (http://erlang.org/doc/man/re.html) includes this in the list of options for compile(Regexp, Options) -> {ok, MP} | {error, ErrSpec}:

caseless

  • Letters in the pattern match both upper and lower case letters.

  • It is equivalent to Perl's /i option, and it can be changed within a pattern by a (?i) option setting.

  • Uppercase and lowercase letters are defined as in the ISO-8859-1 character set.
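
Since the flag has to travel inside the pattern string, a JavaScript RegExp such as /mario/i can be converted before it is sent. The sketch below is an assumption-laden illustration, not an official API: toInlineFlagRegex is a hypothetical name, only the i, m and s flags are carried over as PCRE-style inline options, and g is dropped on the assumption that $regex only tests whether a field matches, so a global flag would have nothing to act on. JavaScript and PCRE pattern syntax also differ in places, so complex patterns may need manual checking.

```javascript
// Hypothetical helper: turn a JavaScript RegExp into a pattern string
// with inline option settings such as (?i).
function toInlineFlagRegex(re) {
  const supported = ["i", "m", "s"]; // flags assumed to map directly to inline options
  let flags = "";
  for (const flag of re.flags) {
    if (supported.includes(flag)) flags += flag;
  }
  // Prefix the options only when at least one supported flag is set.
  return (flags ? "(?" + flags + ")" : "") + re.source;
}

console.log(toInlineFlagRegex(/mario/i));  // (?i)mario
console.log(toInlineFlagRegex(/mario/gi)); // (?i)mario
```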

I hope this helps.