How To Properly Use Robots.txt
Robots.txt is a plain-text file placed in the root folder of a website (so that it is reachable at /robots.txt) to tell search engine robots and spiders which parts of that site they may crawl. Using the robots.txt file this way is part of the robots exclusion standard (also known as the robots exclusion protocol). Search engine robots are not required to obey robots.txt, but in practice the major, well-behaved crawlers do.
The following is a list of common rules that can be set in a robots.txt file:
- Allow all robots to view all files
  User-agent: *
  Disallow:
- Keep all robots from viewing all files
  User-agent: *
  Disallow: /
- Keep all robots from viewing a specific directory
  User-agent: *
  Disallow: /specific_directory/
- Keep a specific robot from viewing a specific directory
  User-agent: Robot_Name
  Disallow: /specific_directory/
- Keep all robots from viewing a specific file
  User-agent: *
  Disallow: /specific_directory/specific_file.html
- Keep a specific robot from viewing a specific file
  User-agent: Robot_Name
  Disallow: /specific_directory/specific_file.html
- Set crawl-delay parameter for all robots
  User-agent: *
  Crawl-delay: (Enter time here in seconds)
- Set crawl-delay parameter for specific robots
  User-agent: Robot_Name
  Crawl-delay: (Enter time here in seconds)
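To see how a crawler would interpret rules like these, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rule text is an assumed combination of the patterns above, `OtherBot` is a made-up robot name, and the 10-second delay is an arbitrary illustrative value. Real crawlers may apply their own matching details, so treat this as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt combining several of the patterns listed above.
# The 10-second delay is an arbitrary example value.
RULES = """\
User-agent: Robot_Name
Disallow: /specific_directory/
Crawl-delay: 10

User-agent: *
Disallow: /specific_directory/specific_file.html
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(RULES)

# Robot_Name is kept out of the whole directory.
blocked_for_named = not parser.can_fetch(
    "Robot_Name", "/specific_directory/page.html")

# Any other robot falls back to the "*" group, which blocks one file only.
allowed_for_others = parser.can_fetch(
    "OtherBot", "/specific_directory/page.html")
file_blocked_for_others = not parser.can_fetch(
    "OtherBot", "/specific_directory/specific_file.html")

# The crawl delay declared for Robot_Name, in seconds.
named_delay = parser.crawl_delay("Robot_Name")

print(blocked_for_named, allowed_for_others, file_blocked_for_others, named_delay)
```

Note that a robot matching a named group uses only that group's rules, which is why the `*` group's file rule is not what blocks `Robot_Name` here.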
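Place any of these combinations into your robots.txt file and save it to the root folder of your site. Nothing else is needed, but each domain and subdomain requires its own robots.txt file, because robots only look for the file at the root of the host they are crawling. Keep in mind that Crawl-delay was not part of the original standard, and some crawlers (Googlebot, for example) ignore it.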
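As an illustration of the one-file-per-host rule, the sketch below derives where a robot would look for robots.txt given any page URL. The function name `robots_url` and the example.com addresses are assumptions made for this example only.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Two pages on different subdomains map to two different robots.txt files.
main_site = robots_url("https://www.example.com/blog/post.html")
shop_site = robots_url("https://shop.example.com/cart?id=1")

print(main_site)
print(shop_site)
```

Because the host name is part of the result, rules saved for www.example.com have no effect on shop.example.com unless that subdomain serves its own file.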
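Using the robots.txt file restrictively lets you keep robots away from pages that do not need to appear in search results. The aim is to avoid spending page rank and link juice on pages such as your about us or terms of service. It can also be used to keep private forums from being crawled and showing up in the search engine results for keywords that appear in the posts. Note that robots.txt blocks crawling rather than indexing: a disallowed URL can still be listed if other sites link to it, and the file itself is publicly readable, so it should not be the only protection for genuinely private content.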
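A restrictive file of this kind could be generated and sanity-checked as in the sketch below. The helper `build_robots_txt`, the robot name `AnyBot`, and the three paths are hypothetical stand-ins for your own about us, terms of service, and private forum locations.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed, user_agent="*"):
    """Build robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed]
    return "\n".join(lines) + "\n"


# Hypothetical paths; substitute the real locations on your site.
PRIVATE_PATHS = ["/about-us.html", "/terms-of-service.html", "/forum/private/"]
robots_txt = build_robots_txt(PRIVATE_PATHS)

# Check the generated rules the way a compliant robot would read them.
checker = RobotFileParser()
checker.parse(robots_txt.splitlines())
results = {
    url: checker.can_fetch("AnyBot", url)
    for url in ["/index.html", "/about-us.html", "/forum/private/thread-1.html"]
}

print(robots_txt)
print(results)
```

Running a check like this before uploading the file helps catch a rule that accidentally blocks pages you do want crawled.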
