I'm trying to learn how to use scrapy and python but I'm not an expert at all... very far from here. I always have an empty file after crawling this page : product of c-discount and I don't understand why...
Here is my code :
import scrapy
from cdiscount_test.items import CdiscountTestItem
f = open('items.csv', 'w').close()
class CdiscountsellersspiderSpider(scrapy.Spider):
name = 'CDiscountSellersSpider'
allowed_domains = ['cdiscount.com']
start_urls = ['http://www.cdiscount.com/mpv-8732-SATENCO.html']
def parse(self, response):
items = CdiscountTestItem()
name = response.xpath('//div[@class="shtName"]/div[@class="shtOver"]/h1[@itemprop="name"]/text()').extract()
country = response.xpath('//div[@class="shtName"]/span[@class="shTopCExp"]/text()').extract()
items['name_seller'] = ''.join(name).strip()
items['country_seller'] = ''.join(country).strip()
pass
And the result I get in the cmd windows :
2017-06-20 18:01:50 [scrapy.core.engine] INFO: Spider opened
2017-06-20 18:01:50 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0
pages/min), scraped 0 items (at 0 items/min)
2017-06-20 18:01:50 [scrapy.extensions.telnet] DEBUG: Telnet console
listening on 127.0.0.1:6023
2017-06-20 18:01:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET
http://www.cdiscount.com/robots.txt> (referer: None)
2017-06-20 18:01:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET
http://www.cdiscount.com/mpv-8732-SATENCO.html> (referer: None)
2017-06-20 18:01:51 [scrapy.core.engine] INFO: Closing spider (finished)
Is there someone to help me please?
Thanks a lot!!!
One probable scenario for the same issue might be the website content is producing dynamically. You can check that by going to the website and tapping view page source. In such cases, you might have to use splash along with scrapy.