How do I tell search engines not to index content via secondary domain names?

shipshape · Aug 17, 2010 · Viewed 12.9k times

I have a website at a.com (for example). I also have a couple of other domain names which I am not using for anything: b.com and c.com. They currently forward to a.com. I have noticed that Google is indexing content from my site using b.com/stuff and c.com/stuff, not just a.com/stuff. What is the proper way to tell Google to only index content via a.com, not b.com and c.com?

It seems that a 301 redirect via .htaccess is the best solution, but I am not sure how to do that. There is only one .htaccess file (the domains do not each have their own).

b.com and c.com are not meant to be aliases of a.com; they are just other domain names I am reserving for possible future projects.

Answer

Paul Rubel · Aug 17, 2010

robots.txt is the way to tell spiders what to crawl and what not to crawl. To block all crawling, put the following in the root of your site at /robots.txt:

User-agent: *
Disallow: /

A well-behaved spider will then not crawl any part of your site. Most large sites have a robots.txt; Google's, for example, contains rules like these:

User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
#and so on ...
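
To check how a rule-abiding crawler would read rules like these, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `allowed` and the sample URLs are only illustrative (b.com is the placeholder domain from the question), and real crawlers may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a rule-abiding crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The block-everything file from above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# A shortened version of the partial rules from above.
PARTIAL = """\
User-agent: *
Disallow: /search
Disallow: /groups
"""

print(allowed(BLOCK_ALL, "http://b.com/stuff"))   # False
print(allowed(PARTIAL, "http://b.com/stuff"))     # True
print(allowed(PARTIAL, "http://b.com/search"))    # False
```

Keep in mind that crawlers fetch /robots.txt separately for each hostname. If b.com and c.com serve the same files as a.com, they also serve the same robots.txt, so a blanket `Disallow: /` would block a.com as well.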