What is the difference between web-crawling and web-scraping?

wassimans · Dec 1, 2010 · Viewed 65.5k times

Is there a difference between web-crawling and web-scraping?

If there's a difference, what's the best method to use in order to collect some web data to supply a database for later use in a customised search engine?

Answer

Ben · Dec 1, 2010

Crawling is essentially what Google, Yahoo, MSN, etc. do: looking for ANY information. Scraping is generally targeted at certain websites, for specific data (e.g. for price comparison), so the two are coded quite differently.
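To make the contrast concrete, here is a minimal sketch using only the Python standard library. The sample HTML, the `price` class name, and the helper names `crawl_links` and `scrape_price` are all made up for illustration; a real site will have its own markup. The crawler-style function collects every link it sees, while the scraper-style function goes after one known field.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page, used for illustration only.
SAMPLE_HTML = """
<html><body>
  <h1>Blue Widget</h1>
  <span class="price">19.99</span>
  <a href="/widgets/red">Red Widget</a>
  <a href="https://example.org/about">About</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Crawler-style: collect every link, whatever the page is about."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


class PriceExtractor(HTMLParser):
    """Scraper-style: pull one known field from a known page layout."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price = None

    def handle_starttag(self, tag, attrs):
        # Assumes the price lives in <span class="price">.
        if tag == "span" and dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and self.price is None:
            self.price = float(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def crawl_links(html, base_url):
    """Return all absolute link URLs found in the page."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def scrape_price(html):
    """Return the price as a float, or None if the layout does not match."""
    parser = PriceExtractor()
    parser.feed(html)
    return parser.price
```

A crawler would feed the collected links back into its queue of pages to visit; a scraper would typically store the extracted value and move on to the next known product page.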

Usually a scraper will be bespoke to the websites it is supposed to scrape, and will do things a (good) crawler wouldn't do, i.e.:

  • Have no regard for robots.txt
  • Identify itself as a browser
  • Submit forms with data
  • Execute JavaScript (if required to act like a user)
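The first two points in the list can be sketched with the Python standard library. This is only an illustration under assumptions: the robots.txt content, the bot name `ExampleCrawler/1.0`, the browser User-Agent string, and the helper names are all hypothetical, and nothing here touches the network. A well-behaved crawler checks robots.txt before fetching, whereas a scraper of the kind described simply builds a request that looks like it came from a browser.

```python
from urllib.robotparser import RobotFileParser
from urllib.request import Request

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

CRAWLER_UA = "ExampleCrawler/1.0"  # hypothetical bot name
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"  # illustrative browser-like string


def crawler_may_fetch(url, robots_txt=ROBOTS_TXT):
    """Crawler behaviour: consult robots.txt rules before fetching a URL."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(CRAWLER_UA, url)


def build_scraper_request(url):
    """Scraper behaviour from the list: no robots.txt check, browser-like identity."""
    return Request(url, headers={"User-Agent": BROWSER_UA})
```

For example, `crawler_may_fetch("https://example.com/private/data")` is refused by the rules above, while `build_scraper_request` produces a request for the same URL without asking. Submitting forms and executing JavaScript usually need heavier tooling and are left out of this sketch.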