Filtering Redis Hash Entries

Paul · Apr 15, 2011

I'm using Redis to store hashes with ~100k records per hash. I want to implement filtering (faceting) of the records within a given hash. Note that a hash entry can belong to n filters.

After reading this and this it looks like I should:

  1. Implement a sorted SET per filter. The values within the SET correspond to the keys within a HASH.
  2. Retrieve the HASH keys from the given filter SET.
  3. Once I have the HASH keys from the SET, fetch the corresponding entries from the HASH. This should give me all entries that belong to the filter.
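For illustration, the three steps above might look roughly like the Python sketch below. The method names (hset, zadd, zrange) follow redis-py, but InMemoryClient is a hypothetical stand-in so the example runs without a server, and the "filter:" key prefix, the helper names, and the sample records are all assumptions made for the example.

```python
class InMemoryClient:
    """Hypothetical stand-in implementing only the calls used below."""

    def __init__(self):
        self.hashes = {}  # hash name -> {field: value}
        self.zsets = {}   # sorted set name -> {member: score}

    def hset(self, name, key, value):
        self.hashes.setdefault(name, {})[key] = value

    def zadd(self, name, mapping):
        self.zsets.setdefault(name, {}).update(mapping)

    def zrange(self, name, start, end):
        # Members ordered by (score, member). As in Redis, `end` is inclusive.
        # Only non-negative indexes are supported by this stand-in.
        zset = self.zsets.get(name, {})
        ordered = sorted(zset, key=lambda m: (zset[m], m))
        return ordered[start:end + 1]


def index_entry(client, hash_name, field, value, filters, score):
    """Store one entry and register its field name under every filter it matches."""
    client.hset(hash_name, field, value)
    for filter_name in filters:
        client.zadd(f"filter:{filter_name}", {field: score})


def page_fields(client, filter_name, page, page_size=500):
    """Return the hash field names on a zero-based page of one filter."""
    start = page * page_size
    return client.zrange(f"filter:{filter_name}", start, start + page_size - 1)


client = InMemoryClient()
index_entry(client, "records", "rec:1", "red shirt", ["red", "shirts"], score=1)
index_entry(client, "records", "rec:2", "red hat", ["red"], score=2)
index_entry(client, "records", "rec:3", "blue shirt", ["shirts"], score=3)

red_page = page_fields(client, "red", page=0)
shirts_page = page_fields(client, "shirts", page=0, page_size=1)
```

Because the filter set is sorted, a page is just an index range, so only the field names for the requested page ever leave the server.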

Firstly, is the above approach correct at a high level?

Assuming the approach is OK, the bit I'm missing is the most efficient way to retrieve the HASH entries. Am I right in thinking that once I have the HASH keys, I should use a PIPELINE to queue multiple HGETALL commands, one per HASH key? Is there a better approach?

My concern about using a PIPELINE is that I believe it will block all other clients while servicing the command. I'll be paging the filtered results at 500 results per page. With multiple browser-based clients performing filtering, not to mention the back-end processes that populate the SETs and HASHes, it sounds like there's potential for a lot of contention if PIPELINE does block. Could anyone provide a view on this?

If it helps, I'm using Redis 2.2.4, Predis for the web clients and ServiceStack for the back end.

Thanks, Paul

Answer

Tom Clarkson · Apr 18, 2011

Individual operations do block, but that doesn't matter because they shouldn't be long-running. It sounds like you are retrieving more information than you really need: HGETALL on a hash of that size will return all 100,000 items when you only need 500.

Sending 500 HGET operations may work (assuming the set stores both hash and key), though it's possible that using hashes at all is a case of premature optimization; you may be better off using regular keys and MGET.
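As a rough sketch of the two options, the Python below fetches one page either with a batch of pipelined HGET calls or with a single MGET over regular keys. The pipeline(transaction=False), hget, execute and mget calls follow redis-py, but the InMemory classes are hypothetical stand-ins so it runs without a server. The "records:" key layout and the helper names are assumptions, and the hash name is passed in separately rather than stored in the set.

```python
class InMemoryPipeline:
    """Hypothetical stand-in: queues HGET calls and answers them on execute()."""

    def __init__(self, client):
        self.client = client
        self.queued = []

    def hget(self, name, key):
        self.queued.append((name, key))
        return self

    def execute(self):
        results = [self.client.hashes.get(n, {}).get(k) for n, k in self.queued]
        self.queued = []
        return results


class InMemoryClient:
    """Hypothetical stand-in implementing only pipeline() and mget()."""

    def __init__(self):
        self.hashes = {}   # hash name -> {field: value}
        self.strings = {}  # regular key -> value

    def pipeline(self, transaction=True):
        return InMemoryPipeline(self)

    def mget(self, keys):
        return [self.strings.get(k) for k in keys]


def fetch_page_pipelined(client, hash_name, fields):
    """Option 1: one HGET per field, sent as a single non-transactional batch."""
    pipe = client.pipeline(transaction=False)
    for field in fields:
        pipe.hget(hash_name, field)
    return dict(zip(fields, pipe.execute()))


def fetch_page_mget(client, keys):
    """Option 2: each entry lives under a regular key, fetched by one MGET."""
    return dict(zip(keys, client.mget(keys)))


client = InMemoryClient()
client.hashes["records"] = {
    "rec:1": "red shirt",
    "rec:2": "red hat",
    "rec:3": "blue shirt",
}
client.strings = {"records:rec:1": "red shirt", "records:rec:2": "red hat"}

page = ["rec:1", "rec:2"]  # field names taken from the filter set
from_hash = fetch_page_pipelined(client, "records", page)
from_keys = fetch_page_mget(client, [f"records:{field}" for field in page])
```

Either way, only the 500 entries on the current page are requested, rather than the whole hash.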