Export csv file from scrapy (not via command line)

Chris · Aug 6, 2014 · Viewed 15.6k times

I successfully exported my items into a CSV file from the command line like this:

   scrapy crawl spiderName -o filename.csv

My question is: What is the easiest way to do the same from code? I need this because I extract the filename from another file. The end scenario should be that I call

  scrapy crawl spiderName

and it writes the items to filename.csv.

Answer

rocktheartsm4l · Aug 6, 2014

Why not use an item pipeline?

WriteToCsv.py

   import csv

   from YOUR_PROJECT_NAME_HERE import settings

   def write_to_csv(item):
       # Open in append mode so each item adds a row; the with-block
       # closes the file and flushes the row after every item.
       with open(settings.csv_file_path, 'a') as f:
           writer = csv.writer(f, lineterminator='\n')
           writer.writerow([item[key] for key in item.keys()])

   class WriteToCsv(object):
       def process_item(self, item, spider):
           write_to_csv(item)
           return item

settings.py

   ITEM_PIPELINES = {'project.pipelines_path.WriteToCsv.WriteToCsv': A_NUMBER_HIGHER_THAN_ALL_OTHER_PIPELINES}
   csv_file_path = PATH_TO_CSV

If you want items written to a separate CSV file for each spider, you could give your spider a CSV_PATH field. Then, in your pipeline, use your spider's field instead of the path from settings.
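That per-spider variant could be sketched like this. Note that CSV_PATH is an assumed attribute name, not a Scrapy built-in; the pipeline falls back to a default path when a spider does not define it:

```python
import csv


class WriteToCsv(object):
    def process_item(self, item, spider):
        # CSV_PATH is a hypothetical per-spider attribute, not part of Scrapy.
        # Fall back to a default filename if the spider does not set it.
        path = getattr(spider, 'CSV_PATH', None) or 'items.csv'
        with open(path, 'a') as f:
            writer = csv.writer(f, lineterminator='\n')
            writer.writerow([item[key] for key in item.keys()])
        return item
```

Each spider class would then just declare something like `CSV_PATH = 'my_spider.csv'`.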

This works; I tested it in my project.

HTH

http://doc.scrapy.org/en/latest/topics/item-pipeline.html