Scrapy - crawled (200) and referer : none

P.Postrique · Jun 20, 2017 · Viewed 9k times

I'm trying to learn how to use Scrapy and Python, but I'm not an expert at all... very far from it. I always end up with an empty output file after crawling this page (a seller page on Cdiscount), and I don't understand why.

Here is my code:

import scrapy

from cdiscount_test.items import CdiscountTestItem

# Truncate items.csv before the crawl (close() returns None, so there is nothing to assign)
open('items.csv', 'w').close()

class CdiscountsellersspiderSpider(scrapy.Spider):
    name = 'CDiscountSellersSpider'
    allowed_domains = ['cdiscount.com']
    start_urls = ['http://www.cdiscount.com/mpv-8732-SATENCO.html']

    def parse(self, response):
        items = CdiscountTestItem()
        name = response.xpath('//div[@class="shtName"]/div[@class="shtOver"]/h1[@itemprop="name"]/text()').extract()
        country = response.xpath('//div[@class="shtName"]/span[@class="shTopCExp"]/text()').extract()

        items['name_seller'] = ''.join(name).strip()
        items['country_seller'] = ''.join(country).strip()
        pass

And this is the output I get in the cmd window:

2017-06-20 18:01:50 [scrapy.core.engine] INFO: Spider opened
2017-06-20 18:01:50 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-06-20 18:01:50 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-06-20 18:01:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.cdiscount.com/robots.txt> (referer: None)
2017-06-20 18:01:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.cdiscount.com/mpv-8732-SATENCO.html> (referer: None)
2017-06-20 18:01:51 [scrapy.core.engine] INFO: Closing spider (finished)

Can someone help me, please?

Thanks a lot!!!

Answer

Gihan Gamage · Feb 16, 2020

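One likely cause of this issue is that the page content is generated dynamically, by JavaScript running in the browser. You can check this by opening the page in a browser and choosing "View page source": if the data you want is not in that raw HTML, Scrapy will not see it either, because Scrapy does not execute JavaScript. In that case, you may need to use Splash along with Scrapy.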
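The "view page source" check can also be done from Python. The following is only a minimal sketch using the standard library: the helper names fetch_raw_html and find_markers are hypothetical, the User-Agent header is an assumption (some sites reject the default one), and the marker strings are simply the class names used in the XPath expressions from the question. The site may also have changed or may block such requests.

```python
from urllib.request import Request, urlopen


def fetch_raw_html(url, timeout=10):
    """Download the page as the server sends it, without running JavaScript.

    Hypothetical helper: the User-Agent value is an assumption, not something
    the site is known to require.
    """
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def find_markers(html, markers):
    """Report, for each marker string, whether it occurs in the raw HTML."""
    return {marker: marker in html for marker in markers}
```

For example, find_markers(fetch_raw_html('http://www.cdiscount.com/mpv-8732-SATENCO.html'), ['shtName', 'shTopCExp']) returns a dictionary of booleans. A False value suggests that the element is added by JavaScript, or that the markup differs from what the XPath expects. Separately, whatever the page contains, Scrapy only exports what parse() yields or returns, so the item built in the spider above would need a `yield items` in place of `pass` to reach the output file.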