I need to index a large number of web pages. What good web crawler utilities are there? Preferably something that .NET can talk to, but that's not a showstopper.
What I really need is something I can give a site URL to, and it will follow every link and store the content for indexing.
HTTrack (http://www.httrack.com/) is a very good website copier. It works pretty well, and I have been using it for a long time.
Nutch (http://lucene.apache.org/nutch/) is a web crawler, which is the type of program you're looking for. It uses Lucene, a top-notch search utility.
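If it helps to see what "give it a site URL, follow every link, store the content" boils down to, here is a rough sketch in Python using only the standard library. It is not how HTTrack or Nutch work internally; the names `crawl`, `fetch_http`, `LinkParser` and `max_pages` are made up for illustration, and it assumes you only want pages on the same host as the start URL. It also ignores robots.txt and rate limiting, which a real crawler should respect.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_http(url):
    """Default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, fetch=fetch_http, max_pages=100):
    """Breadth-first crawl of a single host; returns {url: html}."""
    site = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        pages[url] = html  # store the content for later indexing
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            link, _fragment = urldefrag(urljoin(url, href))
            # Assumption: stay on the starting host and visit each URL once.
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Calling `crawl("http://example.com/")` would return a dictionary mapping each visited URL to its HTML, which you could then hand to whatever indexer you use. The `fetch` parameter is there so the download step can be swapped out, for example for a cached or throttled fetcher.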