Using a Python dictionary as a temporary in-memory key-value database?

user1191575 · Mar 21, 2012

I need something like a temporary in-memory key-value store. I know there are solutions like Redis, but I wonder whether a Python dictionary could work, and potentially be even faster. Picture a Tornado (or similar) server that holds a Python dictionary in memory and returns the appropriate value for each HTTP request.

Why do I need this? As part of a service, key-value pairs are stored with this property: the more recent they are, the more likely they are to be accessed. So I want to keep, say, the last 100 key-value pairs in memory (as well as writing them to disk) for faster retrieval.
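A minimal sketch of the "keep the last 100 keys" idea, assuming a plain in-process cache is acceptable. The class name `RecentCache` and its methods are hypothetical, made up for illustration; it relies only on `collections.OrderedDict` from the standard library and evicts whichever entry was touched least recently.

```python
from collections import OrderedDict


class RecentCache:
    """Hypothetical bounded cache: keeps only the most recently used keys."""

    def __init__(self, max_items=100):
        self.max_items = max_items
        self._data = OrderedDict()

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)          # mark as most recent
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)   # evict the oldest entry

    def get(self, key, default=None):
        if key not in self._data:
            return default                   # caller can fall back to disk
        self._data.move_to_end(key)          # a read also counts as recent use
        return self._data[key]

    def __len__(self):
        return len(self._data)
```

A request handler would call `get()` first and read from disk only when it returns the default.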

If the server dies, the dictionary can be restored from disk.
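One way the restore step might look, as a rough sketch only: it assumes every write is also appended to a JSON-lines file, which is an on-disk format invented here for illustration. The function names `append_to_log` and `restore` are hypothetical, and values are assumed to be JSON-serializable.

```python
import json
import os


def append_to_log(path, key, value):
    """Append one key-value pair as a JSON line (hypothetical on-disk format)."""
    with open(path, 'a', encoding='utf-8') as fh:
        fh.write(json.dumps({'key': key, 'value': value}) + '\n')


def restore(path, max_items=100):
    """Rebuild the in-memory dict from the most recently written keys in the log."""
    data = {}
    if not os.path.exists(path):
        return data
    with open(path, encoding='utf-8') as fh:
        for line in fh:
            record = json.loads(line)
            data.pop(record['key'], None)    # re-insert so later writes are newest
            data[record['key']] = record['value']
            if len(data) > max_items:
                del data[next(iter(data))]   # drop the oldest key
    return data
```

This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.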

Has anyone done something like this? Am I totally missing something here?

PS: I think this isn't possible with a WSGI server, right? As far as I know, you can't keep anything in memory between individual requests.

Answer

Mathias · Mar 21, 2012

I'd definitely go with memcached. Once it has been set up, you can easily decorate your functions/methods, as in my example:

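#!/usr/bin/env python3

import functools
import hashlib
import time

import memcache  # provided by the python-memcached package

# Create the client once instead of on every call.
mc = memcache.Client(['127.0.0.1:11211'], debug=0)

def memoize(f):
    """Cache the results of f in memcached for 60 seconds."""

    @functools.wraps(f)
    def newfn(*args, **kwargs):
        # Build an MD5 key from the function's module and name plus the
        # repr of its arguments (keyword arguments sorted by name).
        m = hashlib.md5()
        m.update(f.__module__.encode('utf-8'))
        m.update(f.__name__.encode('utf-8'))
        parts = [repr(a) for a in args] + [repr(i) for i in sorted(kwargs.items())]
        for part in parts:
            m.update(part.encode('utf-8'))
        key = m.hexdigest()

        value = mc.get(key)
        # get() returns None on a miss, so falsy results such as 0 or ''
        # still count as hits. A cached None cannot be told apart from a miss.
        if value is None:
            value = f(*args, **kwargs)
            mc.set(key, value, 60)
        return value

    return newfn

@memoize
def expensive_function(x):
    time.sleep(5)
    return x

if __name__ == '__main__':
    print(expensive_function('abc'))  # slow: computed, then cached
    print(expensive_function('abc'))  # fast if memcached is running locally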

Don't worry about network latency; optimizing that away would be a waste of your time.