My project makes a lot of network I/O requests. The main data is stored in other projects and is accessed through web APIs (JSON/XML) by polling. We call these APIs for each new user session to fetch information about the user, and sometimes we have to wait a long time for a response.

We use nginx + uwsgi + Django. Django is synchronous (blocking), so we run uwsgi with multiple threads to deal with waiting on network I/O. I decided to read about gevent. I understand the difference between cooperative and preemptive multitasking, and I hoped that gevent would be a better solution than uwsgi threads for this problem (a network I/O bottleneck). But the results were almost identical, and sometimes gevent was slower. Maybe I'm doing something wrong. Please tell me.
Here are the uwsgi configuration examples. Gevent:
$ uwsgi --http :8001 --module ugtest.wsgi --gevent 40 --gevent-monkey-patch
Threading:
$ uwsgi --http :8001 --module ugtest.wsgi --enable-threads --threads 40
Controller example:
import httplib
from urlparse import urlparse

from django.http import JsonResponse


def simple_test_action(request):
    # get data from API without parsing (only for simple I/O test)
    data = _get_data_by_url(API_URL)  # API_URL is defined elsewhere in the project
    return JsonResponse(data, safe=False)


def _get_data_by_url(url):
    u = urlparse(url)
    if str(u.scheme).strip().lower() == 'https':
        conn = httplib.HTTPSConnection(u.netloc)
    else:
        conn = httplib.HTTPConnection(u.netloc)
    path_with_params = '%s?%s' % (u.path, u.query, )
    conn.request("GET", path_with_params)
    resp = conn.getresponse()
    print resp.status, resp.reason
    body = resp.read()
    return body
Test (with geventhttpclient):
# imports inferred from the names used below
from datetime import datetime as dt

import gevent
from geventhttpclient import HTTPClient
from geventhttpclient.url import URL


def get_info(i):
    url = URL('http://localhost:8001/simpletestaction/')
    http = HTTPClient.from_url(url, concurrency=100, connection_timeout=60, network_timeout=60)
    try:
        response = http.get(url.request_uri)
        s = response.status_code
        body = response.read()
    finally:
        http.close()


dt_start = dt.now()
print 'Start: %s' % dt_start
threads = [gevent.spawn(get_info, i) for i in xrange(401)]
gevent.joinall(threads)
dt_end = dt.now()
print 'End: %s' % dt_end
print dt_end - dt_start
In both cases I get similar times. What are the advantages of gevent/greenlets and cooperative multitasking for a problem like this (API proxying)?
A concurrency of 40 is not high enough to let gevent shine. Gevent is about concurrency, not parallelism (or per-request performance), so such a "low" level of concurrency is not a good way to see improvements.
Generally you will see gevent used with concurrency levels in the thousands, not 40 :)
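A rough back-of-the-envelope model may help show why both setups in the question finish in about the same time. This is only a sketch: it assumes every upstream API call takes the same fixed time (the 0.5 s value is made up, not measured) and ignores all CPU and scheduling overhead. `estimated_wall_time` is a hypothetical helper, not part of uwsgi or gevent.

```python
def estimated_wall_time(total_requests, concurrency, latency_s):
    """Idealised wall time when at most `concurrency` requests are in flight."""
    # Requests are served in "waves" of `concurrency`; round the count up.
    waves = (total_requests + concurrency - 1) // concurrency
    return waves * latency_s


if __name__ == "__main__":
    # 401 requests as in the test script; 0.5 s latency is an assumed value.
    for slots in (40, 1000):
        print("%4d slots -> %.1f s" % (slots, estimated_wall_time(401, slots, 0.5)))
```

In this model it makes no difference whether the 40 slots are threads or greenlets: both need the same number of waves. Only a much higher slot count changes the result, and that is where greenlets become cheaper than threads.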
For blocking I/O, Python threads are not bad (the GIL is released during I/O). The advantage of gevent is in resource usage (having 1000 Python threads would be overkill) and in removing the need to think about locking and friends.
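To illustrate the point about threads and blocking I/O, here is a small standard-library sketch. It is not the setup from the question: `time.sleep` stands in for a blocking network read (it also releases the GIL while waiting), and `fake_api_call` and `run_in_threads` are hypothetical names chosen for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_api_call(i, delay=0.2):
    # Stand-in for a blocking HTTP request; the GIL is released while sleeping.
    time.sleep(delay)
    return i


def run_in_threads(n_calls=40, n_threads=40):
    # Run all calls on a thread pool and measure the total wall time.
    start = time.time()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(fake_api_call, range(n_calls)))
    return results, time.time() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads()
    print("%d calls finished in %.2f s" % (len(results), elapsed))
```

Run one after another, the 40 simulated calls would take about 8 seconds. With 40 threads they overlap and the total should be close to a single call's delay, which is why plain threads already handle this level of concurrency well.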
And obviously, remember that your whole app must be gevent-friendly to get an advantage, and Django (by default) requires a bit of tuning (for example, database adapters must be replaced with something gevent-friendly).
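Why one unfriendly library matters can be shown with a small experiment. Note that this sketch uses `asyncio` from the standard library as a stand-in for gevent, since both schedule tasks cooperatively. It is not gevent code, and all function names here are hypothetical. One task "ticks" every 50 ms while another makes either a blocking or a cooperative call, and we measure the longest pause between ticks.

```python
import asyncio
import time


async def ticker(ticks, interval=0.05, count=4):
    # Cooperative task: gives control back to the loop on every await.
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.perf_counter())


async def unfriendly_call(delay=0.5):
    # Blocking call that never yields, like a driver that is not patched.
    time.sleep(delay)


async def friendly_call(delay=0.5):
    # Cooperative equivalent: other tasks keep running while it waits.
    await asyncio.sleep(delay)


async def longest_gap(call):
    # Run the ticker next to `call` and return the longest pause between ticks.
    ticks = [time.perf_counter()]
    await asyncio.gather(ticker(ticks), call())
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print("blocking call, longest stall:    %.2f s" % asyncio.run(longest_gap(unfriendly_call)))
    print("cooperative call, longest stall: %.2f s" % asyncio.run(longest_gap(friendly_call)))
```

With the blocking call, the ticker is frozen for the whole delay, because nothing else can run until the call returns. In a gevent app the same thing happens to every other request in the worker when a library blocks without yielding.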