What is a robots.txt file?
User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
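To see how a crawler might read rules like these, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the test paths are assumptions made for illustration, and only a subset of the rules above is used. Python's parser applies the first matching rule, which may differ from how a particular search engine resolves overlapping Allow and Disallow lines.

```python
from urllib.robotparser import RobotFileParser

# A subset of the example rules above (query-string rules left out for brevity).
RULES = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
"""


def is_allowed(path, rules=RULES, agent="*"):
    """Hypothetical helper: True if `agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from text, no network access
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/searchhistory/", "/search", "/groups", "/about"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Because the Allow line for /searchhistory/ comes before the Disallow line for /search, that directory stays reachable even though its path also begins with /search.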
How to check your robots.txt file
Open your web browser and enter www.yourdomain.com/robots.txt to view the contents of your robots.txt file. Here are the most important tips for a correct robots.txt file:
1. The original robots.txt standard defines only two commands: User-agent and Disallow. Some search engines also understand extensions such as Allow, but not every spider supports them, so do not rely on other commands.
2. Don't change the order of the commands. Start with the User-agent line and then add the Disallow commands.
3. Don't use more than one directory in a Disallow line. "Disallow: /support /cgi-bin/ /images/" does not work. Use an extra Disallow line for every directory.
4. Be sure to use the right case. The file names on your server are case sensitive. If the name of your directory is "Support", don't write "support" in the robots.txt file.
You can find user agent names in your log files by checking for requests to robots.txt. Usually, all search engine spiders should be given the same rights. To do that, use User-agent: * in your robots.txt file.
What happens if you don't have a robots.txt file?
If your Web site doesn't have a robots.txt file (you can check this by entering www.yourdomain.com/robots.txt in your web browser), then search engines will automatically index everything they can find on your site.
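The first three tips can be checked mechanically. The following is a rough sketch of such a check, not a complete validator: the function name lint_robots and the warning texts are invented for this example, and tip 4 (matching case) is not covered because it requires knowing the real directory names on your server.

```python
# Commands accepted by this sketch, following tip 1 (compared case-insensitively).
KNOWN_COMMANDS = {"user-agent", "disallow"}


def lint_robots(text):
    """Return a list of (line_number, message) warnings for robots.txt text."""
    warnings = []
    seen_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            warnings.append((number, "missing ':' separator"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        name = field.lower()
        if name not in KNOWN_COMMANDS:
            # Tip 1: anything besides User-agent and Disallow is flagged.
            warnings.append((number, f"non-standard command '{field}'"))
        elif name == "user-agent":
            seen_agent = True
        else:
            # Tip 2: a Disallow line must follow a User-agent line.
            if not seen_agent:
                warnings.append((number, "Disallow before any User-agent line"))
            # Tip 3: only one path per Disallow line.
            if len(value.split()) > 1:
                warnings.append((number, "more than one path in a Disallow line"))
    return warnings


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /support /cgi-bin/ /images/\n"
    for number, message in lint_robots(sample):
        print(f"line {number}: {message}")
```

Running it on the faulty line from tip 3 reports a warning for line 2; splitting that line into three separate Disallow lines makes the warning disappear.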