Which is faster:
(A) unpickling (loading) a pickled dictionary object with pickle.load()
or
(B) loading a JSON file into a dictionary with simplejson.load()?
Assume that the pickled object file already exists in case A, and that the JSON file already exists in case B.
The speed actually depends on the data: its content and size.
Still, let's take some example JSON data and see which is faster (Ubuntu 12.04, Python 2.7.3):
Given this JSON structure, dumped into both test.json and test.pickle:
{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
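How the two files were written isn't shown, so here is one possible way to produce them, as a rough sketch only. The helper name `dump_both`, the abbreviated stand-in data, the temporary directory, and the choice of binary mode with the default pickle protocol are all assumptions made for illustration.

```python
import json
import os
import pickle
import tempfile


def dump_both(data, json_path, pickle_path):
    """Write the same object once as JSON and once as a pickle."""
    with open(json_path, "w") as f:
        json.dump(data, f)
    # Binary mode and the default protocol are assumptions; how
    # test.pickle was actually written is not stated.
    with open(pickle_path, "wb") as f:
        pickle.dump(data, f)


# Abbreviated stand-in for the glossary structure above.
data = {"glossary": {"title": "example glossary",
                     "GlossDiv": {"title": "S"}}}

# Written to a temporary directory here; point the paths at "test.json"
# and "test.pickle" in the working directory to feed a benchmark script.
out_dir = tempfile.mkdtemp()
json_path = os.path.join(out_dir, "test.json")
pickle_path = os.path.join(out_dir, "test.pickle")
dump_both(data, json_path, pickle_path)

# Round-trip check: both files should decode back to the same structure.
with open(json_path) as f:
    from_json = json.load(f)
with open(pickle_path, "rb") as f:
    from_pickle = pickle.load(f)
```

Both `from_json` and `from_pickle` should compare equal to `data`, which confirms that the two loaders are being asked to rebuild the same object.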
Testing script:
import timeit
import pickle
import cPickle
import json
import simplejson
import ujson
import yajl

# One thin wrapper per library so that every timed statement has the same shape.
def load_pickle(f):
    return pickle.load(f)

def load_cpickle(f):
    return cPickle.load(f)

def load_json(f):
    return json.load(f)

def load_simplejson(f):
    return simplejson.load(f)

def load_ujson(f):
    return ujson.load(f)

def load_yajl(f):
    return yajl.load(f)

# Each statement reopens its file, so the timings include open() as well as parsing.
# timeit() with no argument runs the statement 1,000,000 times.
print "pickle:"
print timeit.Timer('load_pickle(open("test.pickle", "rb"))', 'from __main__ import load_pickle').timeit()
print "cpickle:"
print timeit.Timer('load_cpickle(open("test.pickle", "rb"))', 'from __main__ import load_cpickle').timeit()
print "json:"
print timeit.Timer('load_json(open("test.json"))', 'from __main__ import load_json').timeit()
print "simplejson:"
print timeit.Timer('load_simplejson(open("test.json"))', 'from __main__ import load_simplejson').timeit()
print "ujson:"
print timeit.Timer('load_ujson(open("test.json"))', 'from __main__ import load_ujson').timeit()
print "yajl:"
print timeit.Timer('load_yajl(open("test.json"))', 'from __main__ import load_yajl').timeit()
Output (total seconds for 1,000,000 runs of each statement, the timeit default):
pickle:
107.936687946
cpickle:
28.4231381416
json:
31.6450419426
simplejson:
20.5853149891
ujson:
16.9352178574
yajl:
18.9763481617
As you can see, unpickling with the pure-Python pickle module is by far the slowest option, nearly four times slower than cPickle, so cPickle is definitely the way to go if you choose the pickling/unpickling option. Among the JSON parsers, ujson looks the most promising on this particular data.
Also, the json and simplejson libraries load much faster on PyPy (see Python JSON Performance).
Note that the results may differ on your particular system and with data of a different type or size.
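To check how this plays out on your own data, something like the following could serve as a starting point. It is a minimal sketch using only the standard library: `time_loads` is a hypothetical helper, the sample structure is a placeholder, and it times in-memory `loads()` calls rather than reopening a file on every run, so its numbers are not directly comparable with the ones above. On Python 2, swap in cPickle or any third-party parser you have installed.

```python
import json
import pickle
import timeit


def time_loads(data, number=10000):
    """Return seconds taken by `number` in-memory loads of `data`, per format.

    The data is serialized once up front, so only deserialization is timed.
    """
    json_text = json.dumps(data)
    pickled = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
    return {
        "json": timeit.timeit(lambda: json.loads(json_text), number=number),
        "pickle": timeit.timeit(lambda: pickle.loads(pickled), number=number),
    }


if __name__ == "__main__":
    # Placeholder sample; substitute a structure representative of your data.
    sample = {"glossary": {"title": "example glossary",
                           "GlossSeeAlso": ["GML", "XML"]}}
    for name, seconds in sorted(time_loads(sample).items()):
        print("%s: %.4f s" % (name, seconds))
```

Running it with a few differently shaped inputs (many small keys, long strings, deep nesting) gives a quick feel for how much the ranking depends on the data.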