Thread: Robot.txt
22nd June 2006, 10:22 AM
sourabhweb

The method used to exclude robots from a server is to create a file on the server which specifies an access policy for robots. This file must be accessible via HTTP at the local URL "/robots.txt".
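
To illustrate where a robot would look for this file, here is a minimal Python sketch. It assumes the file sits at "/robots.txt" in the root of the server; the function name robots_url and the example.com address are made up for illustration.

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the robots.txt URL for the server that hosts page_url."""
    # An absolute path replaces the page's own path and query string,
    # leaving only the scheme, host and port of the original URL.
    return urljoin(page_url, "/robots.txt")


if __name__ == "__main__":
    print(robots_url("http://www.example.com/docs/index.html"))
```

Whatever page the robot starts from, it ends up requesting the same single file on that server.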

This method works well because it can be implemented easily on any existing WWW server, and a robot can find the access policy with a single document retrieval.

Points considered in choosing the filename:

The filename should fit within the file naming restrictions of all common operating systems.
The filename extension should not require extra server configuration.
The filename should indicate the purpose of the file and be easy to remember.
The likelihood of a clash with existing files should be minimal.


The Format: The format and semantics of the "/robots.txt" file are as follows:
The file consists of one or more records separated by one or more blank lines (terminated by CR, CR/NL, or NL). Each record contains lines of the form "<field>:<optionalspace><value><optionalspace>". The field name is case insensitive.
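
The record format described above can be sketched roughly like this in Python. This is only an illustration, not a complete parser: the function name parse_records is made up, comments are not handled here, and lines without a ':' are simply skipped, which is my own assumption. The User-agent and Disallow fields in the sample come from the wider robots.txt standard, not from this post.

```python
def parse_records(text):
    """Split robots.txt text into records, each a list of (field, value) pairs."""
    records, current = [], []
    # splitlines() accepts CR, CR/NL and NL line endings alike.
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record in progress, if there is one.
            if current:
                records.append(current)
                current = []
            continue
        field, sep, value = line.partition(":")
        if not sep:
            continue  # not of the form <field>:<value>; skipped in this sketch
        # Field names are case insensitive, so normalise them to lower case.
        # strip() removes the optional space around the value.
        current.append((field.strip().lower(), value.strip()))
    if current:
        records.append(current)
    return records


if __name__ == "__main__":
    sample = "User-Agent: *\r\nDisallow: /tmp/\r\n\r\nuser-agent: webcrawler\nDisallow:\n"
    for record in parse_records(sample):
        print(record)
```

Lower-casing the field name is one simple way to make "User-Agent" and "user-agent" compare equal later on.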

Comments can be included in the file using UNIX Bourne shell conventions: the '#' character indicates that the preceding space (if any) and the remainder of the line up to the line termination are discarded. Lines containing only a comment are discarded completely, and therefore do not indicate a record boundary.
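
The comment rule can be sketched in Python as below. The function name strip_comment and the choice of None to mean "discard this line" are my own assumptions for the example; the point is that a comment-only line is dropped outright and never looks like a blank separator line.

```python
def strip_comment(line):
    """Return the line without its comment, or None for a comment-only line."""
    if "#" not in line:
        return line
    # Drop the '#', everything after it, and any whitespace just before it.
    kept = line.split("#", 1)[0].rstrip()
    # Nothing left means the line held only a comment: discard it entirely,
    # so the caller cannot mistake it for a blank record separator.
    return kept if kept else None


if __name__ == "__main__":
    for raw in ["Disallow: /tmp/   # scratch files", "   # just a note", ""]:
        print(repr(strip_comment(raw)))
```

A caller would skip any line that comes back as None and treat only a truly empty line as the end of a record.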