Possible Duplicate:
How can I speed up fetching pages with urllib2 in python?
I have a Python script that downloads a web page, parses it, and returns some value from the page. I need to scrape a few such pages to get the final result. Each page retrieval takes a long time (5-10 s), and I'd prefer to make the requests in parallel to decrease the wait time.
The question is: which mechanism will do this quickly, correctly, and with minimal CPU/memory overhead? Twisted, asyncore, threading, something else? Could you provide some links with examples?
Thanks
UPD: There are a few solutions to this problem; I'm looking for a compromise between speed and resource usage. If you could share some details from experience (e.g., how fast it is under load in your view), that would be very helpful.
multiprocessing.Pool can be a good choice here, and there are some useful examples around. For instance, if you have a list of URLs, you can map the content retrieval so it runs concurrently:
import multiprocessing

def process_url(url):
    # Download the page, parse it, and return the value you need
    return what_you_want

pool = multiprocessing.Pool(processes=4)  # how much parallelism?
results = pool.map(process_url, list_of_urls)
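Since the work is I/O-bound (waiting on the network), a thread-based pool may be lighter on memory than separate processes. Here is a minimal sketch using multiprocessing.dummy, which exposes the same Pool API backed by threads; the fetch helper and list_of_urls are placeholders you'd adapt to your own parsing:

import urllib2
from multiprocessing.dummy import Pool  # thread-backed pool, same API as multiprocessing.Pool

def fetch(url):
    # Download the page body; your parsing logic would go here
    return urllib2.urlopen(url, timeout=30).read()

pool = Pool(8)  # 8 worker threads; tune to your number of pages
pages = pool.map(fetch, list_of_urls)
pool.close()
pool.join()

Threads avoid the cost of pickling arguments/results and spawning processes, which matters when each task is mostly waiting on the network rather than burning CPU.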