Google ignores the meta robots noindex tag.
How many people have noticed the many thousands of Supplemental pages with a 2005 June or July cache date that have been indexed, show in the SERPs with a full title and description, rank, and have a <meta name="robots" content="noindex"> tag both on the live page and in the old cached copy linked from the SERPs.
Oh yeah! There are thousands of them. Now that is a programming bug.
Obviously when you place the <meta name="robots" content="noindex"> tag on a page, you expect that the page will be spidered, but not indexed. But if Google keeps no data about that URL, they will never "remember" anything about that URL, and will return every few minutes to "discover" the content, again and again.
However, when you think about it, Google must keep a copy of that page internally so that they can tell when the content has changed, and so that they know where that page links to, and so that they know the index/noindex status of the page.
Such a page should never appear in search results, ever. Well, now they do, and in very large numbers; all (so far) marked as Supplemental Results, and with cache dates from a year ago.
It appears that something in their system is forgetting to check the index/noindex status of the pages in their database and is showing them all in the SERPs whatever their status.
I first noticed this yesterday; but found it on some searches that I have not done for several months. I have no idea how long this bug has been showing up... it could be several months.
|