My task is to crawl Google search results using headless WebKit (PyQt4.QtWebKit) in Python. The module crawls the results fine with PyQt4. I have to run this script on Amazon EC2, so I need to use Xvfb (there is no X server on EC2).
The module also has to run in a loop. It works fine for a few iterations, but after some iterations it fails with "xvfb-run: error: Xvfb failed to start".
How can I solve this?
This is my loop:
for i in range(10):
    try:
        query_dict["start"] = i * 10
        url = base_url + ue(query_dict)
        flag = True
        while flag:
            parsed_dict = main(url)
            time.sleep(8.4)
            flag = False
    except:
        pass
And this is main(url):
def main(url):
    cmd = "xvfb-run python /home/shan/temp/hg_intcen/lib/webpage_scrapper.py"+" "+str(url)
    print "Cmd EXE:"+ cmd
    proc = subprocess.Popen(cmd,shell=True,stdin=subprocess.PIPE,stdout=subprocess.PIPE)
    proc.wait()
    sys.stdout.flush()
    result = proc.stdout.readlines()
    print "crawled: ",result[1]
    return result
webpage_scrapper fetches all the HTML results using PyQt4. How can I keep Xvfb from failing inside the loop?
You need to add the --auto-servernum parameter to xvfb-run. Otherwise, it tries to spawn Xvfb on the same display (by default :99), which will fail if you already have one running.
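
As a rough sketch, main(url) from the question could be adapted as follows. The script path is the one from the question; build_command is a hypothetical helper introduced here only to keep the command line easy to inspect. Two further changes are assumptions on my part, not something the question asked for: the command is passed as a list instead of a shell string (so characters such as & in the URL are not interpreted by the shell), and communicate() replaces wait() followed by readlines() (so a large page cannot fill the pipe and stall the child).

```python
import subprocess

# Path taken from the question; adjust it for your own setup.
SCRAPER = "/home/shan/temp/hg_intcen/lib/webpage_scrapper.py"


def build_command(url, scraper=SCRAPER):
    """Return the argv list for one scraper run (hypothetical helper)."""
    # --auto-servernum lets xvfb-run pick a free display number
    # instead of always trying the default :99.
    return ["xvfb-run", "--auto-servernum", "python", scraper, url]


def main(url):
    """Run the scraper once and return its stdout split into lines."""
    # A list argument (no shell=True) passes the URL through unchanged.
    proc = subprocess.Popen(build_command(url), stdout=subprocess.PIPE)
    # communicate() reads stdout while waiting for the process to exit.
    out, _ = proc.communicate()
    return out.splitlines()
```

The short form of the flag is -a, so xvfb-run -a python ... behaves the same way.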