The <META> tag
==============
<META> tags can only appear in the <HEAD> element of a page.
(Note that whereas the <BODY> element holds the CONTENTS OF the page,
the <HEAD> element holds information ABOUT the page: this can usefully
include guidelines for "how to index/catalogue this page".)
Each tag is self-contained: the two attributes are inside the
angle-brackets, and there is no end-tag required.
The <META> tag contains just two attributes:
  The first is either NAME= or HTTP-EQUIV=
  The second is always CONTENT=
(It may also contain a third: LANG= )
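As a rough sketch of how this looks in a page (the title, the author name
and the expiry date are made-up values, not taken from any real site):

```html
<HTML>
<HEAD>
<TITLE>Example page</TITLE>
<!-- NAME form: information that is purely descriptive -->
<META NAME="Author" CONTENT="A. N. Other">
<!-- HTTP-EQUIV form: information of interest to the server or browser -->
<META HTTP-EQUIV="Expires" CONTENT="Tue, 20 Aug 1996 14:25:27 GMT">
</HEAD>
<BODY>
<P>The visible contents of the page go here.</P>
</BODY>
</HTML>
```

Note that neither tag has an end-tag, and both sit inside the HEAD.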
HTTP-EQUIV is used whenever the information is relevant to the server,
browser, or transfer protocol they use; if the information does not have
such a relevance, then NAME should be used.
The idea (and the origin of the name META) is that an arbitrary amount of
extra information can be furnished in an open-ended way.
Hence a (contrived) tag
behaves like a (non-existent!) tag
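To illustrate the equivalence, here is a sketch using an invented property
called "Flavour" (a hypothetical name, chosen only because no such element
or accepted NAME exists):

```html
<!-- A contrived META tag carrying an arbitrary name/value pair -->
<META NAME="Flavour" CONTENT="Vanilla">

<!-- It conveys what a dedicated element would, if HTML had one (it does not):
     <FLAVOUR>Vanilla</FLAVOUR>
-->
```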
The only NAMEs which are worth considering are those which do have an
accepted (ie W3C-defined) usage. Here are some examples, whose meanings
are largely self-explanatory:
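A sketch of what such a list might look like; the names are ones commonly
seen in HTML4.0-era pages, and every CONTENT value here is an illustrative
assumption:

```html
<!-- Declares the document type and the character set in use -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<!-- LANG states the language in which the CONTENT is written -->
<META NAME="Keywords" LANG="en" CONTENT="HTML, META, tags">
<META NAME="Author" CONTENT="A. N. Other">
<META NAME="Copyright" CONTENT="Copyright 1999 A. N. Other">
```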
Note the first one, which seems to be new to HTML4.0, and is part of the
ability to specify the character set in use, in anticipation of eventual
use of "Unicode" or ISO-10646. Documents in Latin-1 for English and
Western European languages should use ISO-8859-1. The Acorn Latin-1
character set is identical to this EXCEPT for character codes 128 to 159,
which are NOT defined in ISO Latin 1 and lie outside ASCII altogether
(so don't use bullets, "sexed" (curly) quote marks, em-dashes, etc!)
The second one gives an example of the LANG attribute; here "English".
Search Engine Robots
--------------------
Probably the most useful reason for some of these META tags is the
ability to furnish Search Engines with information to enable them to
catalogue your site "correctly" (ie, in the category you would prefer).
In the absence of any other information, they would probably analyse
only the <TITLE> and the first 50 words or so of the <BODY>;
but careful choice of the CONTENT associated with the NAME="Description"
and NAME="Keywords" items can be used to influence them to work
in a far less haphazard way;
but please note that if you put a ridiculously large amount of CONTENT
for the "Keywords" tag, some Search Engines will assume you are trying
to pull a fast one, and reject the site completely!
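A minimal sketch of a HEAD prepared with Search Engines in mind; the title,
description and keywords are invented for illustration, and the keyword
list is deliberately kept short:

```html
<HEAD>
<TITLE>Widget repairs in Anytown</TITLE>
<!-- A sentence or two that a Search Engine may show as the page summary -->
<META NAME="Description" CONTENT="Repairs and spare parts for widgets of all makes.">
<!-- A modest, relevant list; an excessive one risks rejection -->
<META NAME="Keywords" CONTENT="widgets, widget repair, spare parts">
</HEAD>
```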
There is a further NAME value, "ROBOTS", which a small but increasing
number of Search Engine "spiders" are beginning to respond to. Its CONTENT
can contain one, or two comma-separated, of the values INDEX, NOINDEX,
FOLLOW and NOFOLLOW.
Thus, if you have a site where you would prefer some of the leaf-pages
NOT to be accessed directly out of context, but only via your main index
page, you could put a tag in the <HEAD> part of those leaf pages asking
robots not to index them.
Or if you did not want ANY of the leaf pages catalogued, then put a tag
asking robots not to follow links in the <HEAD> section of your
index.html page.
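The two cases above might be written as follows. This is a sketch that
assumes the visiting spider honours the NAME="ROBOTS" convention with its
usual values (INDEX, NOINDEX, FOLLOW, NOFOLLOW):

```html
<!-- In the HEAD of each leaf page: ask robots not to catalogue this page -->
<META NAME="ROBOTS" CONTENT="NOINDEX">

<!-- In the HEAD of index.html: catalogue this page, but do not follow its links -->
<META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW">
```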
However, please don't rely on all robots doing what you tell them:
You can't force a robot to catalogue you if it doesn't feel like it;
You can't force a robot to ignore you if it ignores META tags anyway!
Obscure Meta Tags: Refresh
--------------------------
Not surprisingly, there are plenty of instances of "extension" META tag
attributes being invented; one which crops up from time to time (and
is presumably put there to deliberately confuse visitors who are NOT
using the particular commercial browser whose vendors introduced it) is
HTTP-EQUIV="Refresh", whose CONTENT gives a delay in seconds and a URL.
The W3C HTML4.0 specs show the two separated by a comma; but it appears
that the form that gets used in practice, and is more likely to be
understood, separates them with a semicolon and puts "URL=" before the
address.
The intention here is that the index.htm page containing that tag
will be displayed for a given number of seconds after being fetched,
and then automatically replaced by a further page as referenced.
If the URL is wrapped in single quote marks, note that they are optional
and best omitted.
W3C have heard of this tag, and strongly recommend that such an index
page should contain a hyperlink to the next page, so that visitors using
a browser that does not implement that extension are not left stuck
looking at an unhelpful "transient" page; but do any of the perpetrators
of this sort of thing remember to do so?
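A sketch of a "transient" index page that follows that recommendation. The
5-second delay and the target "main.htm" are assumptions made for the
example, and the Refresh line uses the form commonly seen in practice:

```html
<HTML>
<HEAD>
<TITLE>Welcome</TITLE>
<!-- Delay in seconds, then "URL=" and the page to load next -->
<META HTTP-EQUIV="Refresh" CONTENT="5; URL=main.htm">
</HEAD>
<BODY>
<!-- Fallback for browsers that do not implement Refresh -->
<P>If nothing happens, please go to the <A HREF="main.htm">main page</A>.</P>
</BODY>
</HTML>
```

Visitors whose browser ignores the tag can still move on via the link.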
(Off-topic) robots.txt
----------------------
A Web server may contain a file called "robots.txt" in its root, which
can contain instructions to visiting spiders/robots, and allows much more
control than the simpler method above.
However, it ONLY works if it is in the root directory of the server,
and so is NOT applicable to sites within a user directory.
John Alldred
31 January 1999
john@protovale.co.uk
http://www.protovale.co.uk/john/
http://www.argonet.co.uk/users/protovale/john.html