I am receiving a 302 response from a server while scraping a website:
2014-04-01 21:31:51+0200 [ahrefs-h] DEBUG: Redirecting (302) to <GET http://www.domain.com/Site_Abuse/DeadEnd.htm> from <GET http://domain.com/wps/showmodel.asp?Type=15&make=damc&a=664&b=51&c=0>
I want to send the request to the original GET URLs instead of being redirected. I found this middleware:
https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/downloadermiddleware/redirect.py#L31
I added this redirect code to my middleware.py file and added this to settings.py:
DOWNLOADER_MIDDLEWARES = {
    'street.middlewares.RandomUserAgentMiddleware': 400,
    'street.middlewares.RedirectMiddleware': 100,
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
}
But I am still being redirected. Is that all I have to do to get this middleware working? Am I missing something?
Forget about middlewares in this scenario; this will do the trick:
meta = {'dont_redirect': True, 'handle_httpstatus_list': [302]}
That said, you will need to pass the meta parameter when you yield your request:
yield Request(
    item['link'],
    meta={
        'dont_redirect': True,
        'handle_httpstatus_list': [302],
    },
    callback=self.your_callback,
)
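If several requests in the spider need the same treatment, it may be convenient to build the meta dict in one place. The following is only a minimal sketch: `no_redirect_meta` is a hypothetical helper name (not part of Scrapy), and it assumes the two meta keys shown above are all you need.

```python
def no_redirect_meta(statuses=(302,), **extra):
    """Build a meta dict asking Scrapy not to follow redirects.

    Hypothetical helper, not part of Scrapy itself.
    `statuses` lists the HTTP codes the callback should receive;
    `extra` holds any other meta keys you want to carry along.
    """
    meta = dict(extra)
    # Keep the redirect response instead of following it
    meta['dont_redirect'] = True
    # Let these status codes reach the callback instead of being filtered out
    meta['handle_httpstatus_list'] = list(statuses)
    return meta


# Possible usage inside a spider method (assumes `Request` is imported
# from scrapy and `item` / `your_callback` exist as in the answer above):
#     yield Request(item['link'], meta=no_redirect_meta(),
#                   callback=self.your_callback)
```

In the callback you can then check response.status to see whether the server answered with a 302 and decide what to do with it.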