How to allow crawlers access to index.php only, using robots.txt?

todd · Oct 28, 2009 · Viewed 14.4k times

If I want to allow crawlers to access only index.php, will this work?

User-agent: *
Disallow: /
Allow: /index.php

Answer

Simone Carletti · Oct 30, 2009

Yes, it will work for Googlebot. Here's the test result from Google Webmaster Tools.

Url
http://www.example.org/index.php

Googlebot
Allowed by line 3: Allow: /index.php

Googlebot-Mobile
Allowed by line 3: Allow: /index.php

However, remember that with this configuration your site's homepage won't be crawled unless it is requested with the fully qualified path. In other words, http://www.example.org/ is forbidden while http://www.example.org/index.php is allowed.

If you want your homepage to be accessible too, here's a better version of your file. The `$` anchors the end of the URL path, so `Allow: /$` matches only the bare `/`.

User-agent: *
Disallow: /
Allow: /index.php
Allow: /$
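
To see why the extra `Allow: /$` line matters, here is a minimal Python sketch of the longest-match precedence that Google documents for its crawlers: among the rules that match a path, the one with the longest pattern wins, and Allow wins a tie. The helper names (`pattern_to_regex`, `is_allowed`) and the `ORIGINAL` / `FIXED` rule lists are hypothetical and exist only for illustration. Other crawlers may resolve conflicts differently, for example by taking the first matching rule, so treat this as an approximation rather than a reference implementation.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Without '$' the pattern is a plain prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs such as
    ("allow", "/index.php"). Assumed precedence: the longest matching
    pattern wins, and "allow" wins when two patterns have equal length.
    """
    best_len = -1
    best_allow = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


# The file from the question and the revised file from the answer.
ORIGINAL = [("disallow", "/"), ("allow", "/index.php")]
FIXED = ORIGINAL + [("allow", "/$")]

if __name__ == "__main__":
    for path in ("/", "/index.php", "/about.html"):
        print(path, is_allowed(ORIGINAL, path), is_allowed(FIXED, path))
```

Under these assumptions, the original file blocks `/` while the revised file allows it; both files allow `/index.php` and block every other path.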