How to create a search index in CouchDB?

user2670818 · May 18, 2016 · Viewed 14.3k times

Assuming CouchDB is configured locally, how and where do I create a search index similar to the one Cloudant offers on Bluemix?


Answer

user2670818 · May 20, 2016

The solution I was searching for is based on the couchdb-lucene library.

  1. I had to install CouchDB 1.6.1 so that a database was available at http://localhost:5984.
  2. The next step was to install couchdb-lucene, a Maven-based app, which runs on http://localhost:5985 and responds with the following when it is up:

{"couchdb-lucene":"Welcome","version":"1.1.0-SNAPSHOT"}

To make it run, I had to build it with mvn in the project's root directory, then go to target and run ./bin/run inside the unzipped couchdb-lucene directory:

root@mario-VirtualBox:/home/mario/CouchDB_mario/couchdb-lucene/target/couchdb-lucene-1.1.0-SNAPSHOT# ./bin/run

  3. The next step was to connect the two servers. All I had to do was map them via a proxy in /etc/couchdb/local.ini.

All you need there is the following snippet:

[httpd_global_handlers]
_fti = {couch_httpd_proxy, handle_proxy_req, <<"http://localhost:5985">>}

With that in place, I was finally able to query CouchDB using Apache Lucene indexing.

  4. Before querying, I had to insert my own JSON design document: not a new design document created through the UI, nor a new view, but a plain JSON document. Essentially this hacks CouchDB a little with a fake design document so that it can support Lucene search. I used a curl request of the following form:

curl -X PUT http://localhost:5984/user14169_slovnik_medical/_design/medical -d @user14169_slovnik_medical.json

Where the JSON Design Document looked like this:

{
   "_id": "_design/medical",
   "fulltext": {
       "by_meaning": {
           "index": "function(doc) { var ret=new Document(); ret.add(doc.vyznam); return ret }"
       },
       "by_shortcut": {
           "index": "function(doc) { var ret=new Document(); ret.add(doc.zkratka); return ret }"
       }
   }
}
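
The curl upload above can also be done from Java. The following is a minimal sketch, assuming Java 11 or later (for java.net.http) and the same host, database, and design document names used in this answer. The class name DesignDocUpload, the buildRequest helper, and the --send argument are made up for illustration and are not part of CouchDB or couchdb-lucene.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: uploads a design document with an HTTP PUT, like the curl call.
final class DesignDocUpload {
    // The design document from this answer, inlined as a JSON string.
    static final String DESIGN_DOC = "{"
        + "\"_id\":\"_design/medical\","
        + "\"fulltext\":{"
        + "\"by_meaning\":{\"index\":\"function(doc) { var ret=new Document(); ret.add(doc.vyznam); return ret }\"},"
        + "\"by_shortcut\":{\"index\":\"function(doc) { var ret=new Document(); ret.add(doc.zkratka); return ret }\"}"
        + "}}";

    // Builds PUT <couchUrl>/<db>/_design/<name> with a JSON body; no network access here.
    static HttpRequest buildRequest(String couchUrl, String db, String name, String json) {
        URI uri = URI.create(couchUrl + "/" + db + "/_design/" + name);
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
            .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest(
            "http://localhost:5984", "user14169_slovnik_medical", "medical", DESIGN_DOC);
        System.out.println(request.method() + " " + request.uri());
        // Only contact the server when explicitly asked (hypothetical flag for this sketch).
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }
}
```

Run without arguments, it only prints the request line; with --send it performs the upload against a running CouchDB and prints the status code and response body.
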

  5. As an example, suppose this search index is defined and the database holds JSON documents like this one:

{
   "_id": "63e5c848fa2211c3b063d6feccd3d942",
   "_rev": "1-899a6924ed08097b1a37e497d91726fd",
   "DATAWORKS_DOCUMENT_TYPE": "user14169_slovnik_medical",
   "vyznam": "End to side",
   "zkratka": "e-t-s"
}

Then you can easily run queries like this:

http://localhost:5984/_fti/local/user14169_slovnik_medical/_design/medical/by_meaning?q=lob~

This returns the expected data.

The local prefix is there because I run the database on localhost on a single node, and couchdb-lucene connects to localhost by default.
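
The query URL has a regular shape, so it can be assembled programmatically. Below is a minimal sketch, assuming Java 10 or later and the URL layout shown above (/_fti/local/<database>/_design/<design document>/<index>?q=<query>). The class FtiUrl and its build method are hypothetical names for this example. The query text is form-encoded, so characters such as ~ or spaces are escaped; this assumes the server decodes the q parameter in the usual way.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that assembles a couchdb-lucene search URL behind the _fti proxy.
final class FtiUrl {
    static String build(String couchUrl, String key, String db,
                        String designDoc, String index, String query) {
        // Escape the Lucene query so it is safe inside a URL query string.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return couchUrl + "/_fti/" + key + "/" + db
            + "/_design/" + designDoc + "/" + index + "?q=" + encoded;
    }

    public static void main(String[] args) {
        // Same fuzzy query as in the example URL above.
        System.out.println(build("http://localhost:5984", "local",
            "user14169_slovnik_medical", "medical", "by_meaning", "lob~"));
    }
}
```

The printed URL is the one from the example, except that ~ appears as %7E.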

The coolest thing is that you can use the org.lightcouch client library (a jar) from Java and make simple calls like this:

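import com.google.gson.JsonObject;
import org.lightcouch.CouchDbClient;

// Connect to the database on the local CouchDB node (created if it does not exist).
CouchDbClient dbClient = new CouchDbClient("user14169_slovnik_medical", true, "http", "127.0.0.1", 5984, null, null);

// Full-text search URL, routed through the _fti proxy to couchdb-lucene.
String uriFullText = dbClient.getBaseUri() + "_fti/local/user14169_slovnik_medical/_design/medical/by_shortcut?q=lob*";

// Fetch the search result as a raw JSON object and print it.
JsonObject result = dbClient.findAny(JsonObject.class, uriFullText);
System.out.println(result.toString());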
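
If LightCouch is not on the classpath, the same search can be issued with the JDK's own HTTP client. This is only a sketch, assuming Java 11 or later and the same proxy URL as above. The class LuceneSearch, the searchRequest helper, and the --send argument are hypothetical names for this example, and the response body is printed as raw text without parsing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: runs a couchdb-lucene search through CouchDB's _fti proxy.
final class LuceneSearch {
    // Builds a GET request for the given full-text URL; no network access here.
    static HttpRequest searchRequest(String ftiUrl) {
        return HttpRequest.newBuilder(URI.create(ftiUrl))
            .header("Accept", "application/json")
            .GET()
            .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = searchRequest(
            "http://127.0.0.1:5984/_fti/local/user14169_slovnik_medical/_design/medical/by_shortcut?q=lob*");
        System.out.println(request.method() + " " + request.uri());
        // Only contact the server when explicitly asked (hypothetical flag for this sketch).
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }
}
```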