robots.txt parser in Java

zahir hussain · Jun 29, 2010 · Viewed 7.7k times

I want to know how to parse a robots.txt file in Java.

Is there any existing code for this?

Answer

Bill the Lizard · Jun 29, 2010

Heritrix is an open-source web crawler written in Java. Looking through their javadoc, I see that they have a utility class Robotstxt for parsing the robots.txt file.
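If pulling in a whole crawler is more than you need, the core of the format is simple enough to handle by hand. The following is a minimal sketch, not Heritrix's Robotstxt class or its API: the class name SimpleRobotsTxt and its methods parse and isAllowed are hypothetical names made up for illustration. It assumes only User-agent and Disallow lines matter, treats Disallow values as plain path prefixes, and ignores Allow, wildcards, Crawl-delay and Sitemap, so a production crawler would need more than this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical minimal robots.txt parser (illustration only, not the Heritrix API). */
final class SimpleRobotsTxt {
    // Maps a lower-cased user-agent token to the Disallow path prefixes of its group.
    private final Map<String, List<String>> disallowed = new LinkedHashMap<>();

    /** Parses the text of a robots.txt file that has already been fetched. */
    static SimpleRobotsTxt parse(String content) {
        SimpleRobotsTxt robots = new SimpleRobotsTxt();
        List<String> currentAgents = new ArrayList<>();
        boolean lastWasAgent = false;
        for (String rawLine : content.split("\\r?\\n")) {
            // Drop comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (value.isEmpty()) {
                    continue; // ignore an agent line with no token
                }
                if (!lastWasAgent) {
                    currentAgents = new ArrayList<>(); // a new group starts here
                }
                String agent = value.toLowerCase(Locale.ROOT);
                currentAgents.add(agent);
                robots.disallowed.computeIfAbsent(agent, k -> new ArrayList<>());
                lastWasAgent = true;
            } else {
                // An empty Disallow value means "nothing is disallowed", so skip it.
                if (field.equals("disallow") && !value.isEmpty()) {
                    for (String agent : currentAgents) {
                        robots.disallowed.get(agent).add(value);
                    }
                }
                lastWasAgent = false;
            }
        }
        return robots;
    }

    /** Returns true if no Disallow prefix in the matching group covers the path. */
    boolean isAllowed(String userAgent, String path) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        List<String> rules = null;
        // Prefer a group whose token appears in the user-agent string.
        for (Map.Entry<String, List<String>> entry : disallowed.entrySet()) {
            if (!entry.getKey().equals("*") && ua.contains(entry.getKey())) {
                rules = entry.getValue();
                break;
            }
        }
        if (rules == null) {
            rules = disallowed.get("*"); // fall back to the wildcard group
        }
        if (rules == null) {
            return true; // no applicable group at all
        }
        for (String prefix : rules) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

To use it, fetch the file yourself, pass its text to SimpleRobotsTxt.parse(...), and call isAllowed(userAgent, path) before each request. A group that names your crawler takes precedence over the "*" group, which is why the lookup tries specific tokens first.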