I've decided to use the Python logging module because the messages Twisted generates on standard error are too verbose, and I want INFO-level, meaningful messages such as those generated by the StatsCollector to be written to a separate log file while keeping the on-screen messages:
from twisted.python import log
import logging

# Write INFO and above to buyerlog.txt, truncating the file on each run
logging.basicConfig(level=logging.INFO, filemode='w', filename='buyerlog.txt')

# Route Twisted's log events into the stdlib logging machinery
observer = log.PythonLoggingObserver()
observer.start()
Well, this is fine, I've got my messages, but the downside is that I cannot tell which spider generated each message. This is my log file, with "twisted" being displayed by %(name)s:
INFO:twisted:Log opened.
INFO:twisted:Scrapy 0.12.0.2543 started (bot: property)
INFO:twisted:scrapy.telnet.TelnetConsole starting on 6023
INFO:twisted:scrapy.webservice.WebService starting on 6080
INFO:twisted:Spider opened
INFO:twisted:Spider opened
INFO:twisted:Received SIGINT, shutting down gracefully. Send again to force unclean shutdown
INFO:twisted:Closing spider (shutdown)
INFO:twisted:Closing spider (shutdown)
INFO:twisted:Dumping spider stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 3,
 'downloader/request_bytes': 9973,
Compare this to the messages Twisted generates on standard error:
2011-12-16 17:34:56+0800 [expats] DEBUG: number of rules: 4
2011-12-16 17:34:56+0800 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2011-12-16 17:34:56+0800 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2011-12-16 17:34:56+0800 [iproperty] INFO: Spider opened
2011-12-16 17:34:56+0800 [iproperty] DEBUG: Redirecting (301) to <GET http://www.iproperty.com.sg/> from <GET http://iproperty.com.sg>
2011-12-16 17:34:57+0800 [iproperty] DEBUG: Crawled (200) <
I've tried %(name)s and %(module)s, amongst others, but I don't seem to be able to show the spider name. Does anyone know the answer?
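The closest I've got is a custom observer that copies Twisted's 'system' field into each message; as far as I can tell, that field is what produces the [iproperty] prefix on standard error, so it should carry the spider name. This is only a sketch (the SpiderNameObserver class name is mine, and I haven't verified the field against the Scrapy source):

import logging
from twisted.python import log

class SpiderNameObserver(log.PythonLoggingObserver):
    # Prepend the event's 'system' field (assumed to hold the spider
    # name, as in the [iproperty] prefix on stderr) to every record
    def emit(self, eventDict):
        text = log.textFromEventDict(eventDict)
        if text is None:
            return
        level = logging.ERROR if eventDict.get('isError') else logging.INFO
        self.logger.log(level, '[%s] %s' % (eventDict.get('system', '-'), text))

observer = SpiderNameObserver()
observer.start()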
EDIT:
The problem with using LOG_FILE and LOG_LEVEL in settings is that the lower-level messages will not be shown on standard error.
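For reference, the settings-based approach I mean looks like this (the values are just my example):

# settings.py
LOG_FILE = 'buyerlog.txt'  # send the log to a file instead of stderr
LOG_LEVEL = 'INFO'         # drop everything below INFO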
You want to use the ScrapyFileLogObserver.
import logging
from scrapy.log import ScrapyFileLogObserver

# Attach an observer that mirrors the Twisted-style log (timestamps
# and [spidername] prefixes included) to a file, down to DEBUG level;
# the on-screen messages are left untouched
logfile = open('testlog.log', 'w')
log_observer = ScrapyFileLogObserver(logfile, level=logging.DEBUG)
log_observer.start()
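Since observers simply attach to Twisted's logging, you can start more than one; for example (the file name is mine), a second observer that keeps an errors-only file alongside the full debug log:

errfile = open('errors.log', 'w')
err_observer = ScrapyFileLogObserver(errfile, level=logging.ERROR)
err_observer.start()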
I'm glad you asked this question; I've been wanting to do this myself.