I am interested in doing web crawling, and I was looking at Solr.
Does Solr do web crawling, or what are the steps to do web crawling?
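Solr 5+ does in fact now do basic web crawling, through its bundled post tool (bin/post), which can fetch a URL and follow links to a limited depth. http://lucene.apache.org/solr/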
Older Solr versions do not do web crawling on their own. Historically, Solr is a search server that provides full-text search capabilities, and it builds on top of Lucene.
If you need to crawl web pages using another Solr project then you have a number of options including:
If you want to use the search facilities provided by Lucene or Solr, you'll need to build indexes from the web crawl results.
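As a rough illustration of that last step, here is a minimal Python sketch that turns one crawled page into a Solr document and posts it to Solr's JSON update handler. The field names (id, title, content) and the helper names are assumptions for illustration only; they must match whatever schema your core actually uses, and a real crawler would also need politeness delays, robots.txt handling, and URL de-duplication.

```python
import json
import urllib.request
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects the <title>, visible text, and outgoing links of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self.links = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def build_solr_doc(url, html):
    """Return (document, links) for one crawled page.

    The field names are hypothetical; adapt them to your Solr schema.
    """
    parser = PageParser()
    parser.feed(html)
    parser.close()
    doc = {
        "id": url,  # the page URL doubles as the unique key
        "title": parser.title.strip(),
        "content": " ".join(parser.text_parts),
    }
    return doc, parser.links


def post_to_solr(docs, update_url):
    """POST a list of documents as a JSON array to a Solr update handler."""
    body = json.dumps(docs).encode("utf-8")
    request = urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Assuming a local Solr with a core named mycore (a made-up name), you would call something like post_to_solr([doc], "http://localhost:8983/solr/mycore/update?commit=true"). The links returned by build_solr_doc are what a crawler would queue up to fetch next.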
See this also: