1 rtomes Jun 07, 2007 01:43
3 rtomes Jun 07, 2007 02:49
Yes, it is useful info thanks EdB.
Presumably that link is there all the time as it refers only to the blog, not an article. At any one time it would appear to be associated with the last article - I am thinking aloud here - because the offending article was the last one until about a day ago.
Is there a way to tell search engines to not follow the syndication feeds? At present I have it set to do:
<meta name="robots" content="noindex,follow"/>
and I am guessing / hoping there is an instruction to say don't follow the feeds also.
4 rtomes Jun 07, 2007 03:17
I might have found the answer to my problem myself !!!
I have been looking at the page on robots.txt files at
http://www.google.com/support/webmasters/bin/answer.py?answer=40360&topic=8846
and I think this may be a way to tell search engines not to index the feed pages on each blog. I will report again later if I have success (which may take a week or two to establish).
5 rtomes Jun 07, 2007 03:46
This is what I have done. I made a robots.txt file and installed it in my domain's main directory (i.e. not in your blog directory, but in the outermost directory of your domain). It contains this:
User-Agent: *
Disallow: /b2/index.php/*?tempskin*
which should stop all search engines (the "*" in the first line) from crawling all feed types (the 2nd "*" in the 2nd line) for all blogs (the 1st "*" in the 2nd line). It should mean that only the real pages are indexed. I was tempted to put "*?*" as there might be other nonsense URLs as well, but I was worried about blocking something useful. Note that some search engines may not recognise the "*"s in the 2nd line, but Google does.
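To check which URLs a rule like this would catch, here is a minimal Python sketch of wildcard matching as Google documents it for robots.txt: "*" matches any run of characters, a trailing "$" anchors the end of the URL, and otherwise the pattern only has to match from the start of the path. The function name `rule_matches` and the example blog URLs are made up for illustration, and other search engines may interpret the pattern differently.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a Disallow pattern matches a URL path (query string included).

    Assumes Google-style matching: '*' is a wildcard, a trailing '$'
    anchors the end, and the pattern is otherwise a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives the prefix behaviour.
    return re.match(regex, path) is not None


RULE = "/b2/index.php/*?tempskin*"

# Hypothetical URLs: a feed link and a normal post page.
print(rule_matches(RULE, "/b2/index.php/myblog?tempskin=_rdf"))
print(rule_matches(RULE, "/b2/index.php/myblog/2007/06/07/some-post"))
```

The feed URL matches the rule and the ordinary post URL does not, which is the behaviour the robots.txt entry is aiming for.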
Now I just have to sit back and wait to see if it works OK. If it does, I would recommend this to everyone who has b2 blogs, as it stops people following a Google link that does not work for your pages. At the moment, someone who did a search that should have found your pages clicks the result and gets a bookmark request instead. That is very confusing, and they would probably give up at that point, as there is nothing there to help even quite a clever person get to the page.
6 rtomes Jun 13, 2007 02:02
Well, it is 5 days or so later and Google (UK) is still sending people to these silly places. I would have thought that they would recognise these types of templates.
7 edb Jun 13, 2007 16:48
I'm not sure but I think it takes quite a while for something to fade away from search engine caches. 5 days doesn't sound like enough time to me. I think your approach is correct, so be a bit more patient. I think!
8 yabba Jun 13, 2007 17:49
You could try adding this to the very top of your feed skins:
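<?php
// Send visitors who arrive from a search engine to the normal page instead of the feed.
if( $Hit->referer_type == 'search' ){
	// Rebuild the current URL, leaving out the tempskin parameter, and redirect to it.
	header( 'Location: '.regenerate_url( 'tempskin' ) );
	exit();
}
?>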
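As I read it, the redirect sends a search visitor to the same address with the tempskin parameter left out. A rough Python sketch of that URL rewrite, outside b2evolution, is below; `strip_param` and the example.com addresses are hypothetical, and this is not how `regenerate_url` is implemented, only an illustration of the intended result.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url: str, name: str = "tempskin") -> str:
    """Return url with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    # Keep all query pairs except the one that selects the feed skin.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical feed URL as it might appear in a search result.
print(strip_param("http://example.com/b2/index.php/myblog?tempskin=_rdf"))
```

Any other query parameters are kept, so a visitor should land on the page they were actually searching for.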
9 village_idiot Jun 13, 2007 20:41
10 rtomes Jun 14, 2007 04:07
Thank you EdB, Yabba and whoo.
tempskin=_rdf is actually a syndication feed. That doesn't help, but info is info eh?