Elasticsearch search query to retrieve all records NEST

ASN · Jun 13, 2016 · Viewed 17.4k times

I have a few documents in a folder and I want to check whether all of them are indexed. To do so, for each document name in the folder, I would like to loop through the documents indexed in ES and compare. So I want to retrieve all the documents.

There are a few possible duplicates of this question, such as "retrieve all records in a (ElasticSearch) NEST query" and one other linked question, but they didn't help me because the documentation has changed since then (there is nothing about scan in the current documentation).

I tried using client.Search<T>(), but as per the documentation only 10 results are retrieved by default. I would like to get all the records without specifying the number of records, because the size of the index changes.

Or is it possible to get the size of the index first and then pass that number as the size, so that I get all the documents and can loop through them?

Answer

ASN · Jun 14, 2016

Here is how I solved my problem. Hope this helps. (References: https://www.elastic.co/guide/en/elasticsearch/client/net-api/1.x/scroll.html and https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-search-context)

List<string> indexedList = new List<string>();

// Open a scan/scroll search. With SearchType.Scan the first response carries
// only a scroll ID; the hits arrive with the subsequent Scroll calls.
var scanResults = client.Search<ClassName>(s => s
    .From(0)
    .Size(2000)
    .MatchAll()
    .Fields(f => f.Field(fi => fi.propertyName)) // I used a field to get only the value I needed rather than the whole document
    .SearchType(Elasticsearch.Net.SearchType.Scan)
    .Scroll("5m")
);

// Keep fetching batches until one comes back empty.
var results = client.Scroll<ClassName>("10m", scanResults.ScrollId);
while (results.Documents.Any())
{
    foreach (var doc in results.Fields)
    {
        indexedList.Add(doc.Value<string>("propertyName"));
    }

    results = client.Scroll<ClassName>("10m", results.ScrollId);
}
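
With indexedList populated, the folder comparison described in the question could be done as a set difference instead of a nested loop. This is only a rough sketch and not part of the original solution: the IndexCheck class, the FindUnindexed method, and the case-insensitive match on file names are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class IndexCheck
{
    // Hypothetical helper: returns the file names found in the folder listing
    // that do not appear among the names retrieved from the index.
    public static List<string> FindUnindexed(
        IEnumerable<string> folderFiles,
        IEnumerable<string> indexedNames)
    {
        // Assumption: file names are compared case-insensitively.
        var indexed = new HashSet<string>(indexedNames, StringComparer.OrdinalIgnoreCase);

        return folderFiles
            .Select(path => Path.GetFileName(path)) // drop any directory part
            .Where(name => !indexed.Contains(name))
            .ToList();
    }
}
```

For example, IndexCheck.FindUnindexed(Directory.GetFiles(folderPath), indexedList) would return the names still missing from the index, where folderPath is a placeholder for the folder being checked.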

EDIT

var response = client.Search<Document>(s => s
                         .From(fromNum)
                         .Size(PageSize)
                         .Query(q => q ....