3/02/2009

URLs restricted by robots.txt - Help!

I tried to submit my blog's sitemap using Google's Webmaster Tools and received a report that 74 of my posts/URLs were restricted by robots.txt. Can anyone help me with this? This is obviously killing my web traffic, but I have no idea what I'm doing wrong. HELP please!

5 comments:

firephish said...

Your robots file looks normal:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search

Sitemap: http://owlbox.blogspot.com/feeds/posts/default?orderby=updated


The "Disallow: /search" entry keeps crawlers away from every URL whose path starts with /search, which on Blogger includes the label pages, so those are not indexed. That is possibly what is being reported?
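To see which URLs the rules quoted above actually block, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is copied from the comment; the two paths being tested (`/search/label/owls` and `/2009/03/example-post.html`) are hypothetical examples, not real pages from the blog.

```python
from urllib.robotparser import RobotFileParser

# robots.txt contents as quoted in the comment above.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search

Sitemap: http://owlbox.blogspot.com/feeds/posts/default?orderby=updated
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the rules above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://owlbox.blogspot.com" + path)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for path in ("/search/label/owls", "/2009/03/example-post.html"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Under these rules a label page is blocked for ordinary crawlers while a regular post URL is allowed, which would fit the restricted URLs being label pages rather than posts.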

Owlman said...

Thanks Firephish,

I submitted my atom.xml as my sitemap, i.e. http://www.owlbox.blogspot.com/atom.xml. When I looked this morning this was the summary I got:

Sitemap type: Web
Format: RSS Feed
Submitted: 8 hours ago
Last downloaded by Google: 6 hours ago
Status: Errors
Total URLs in Sitemap: 1
Indexed URLs in Sitemap: 0

Does this mean that my sitemap is not being indexed because of the errors? The summary page shows 10 errors. Nine of them are "URL not allowed: This url is not allowed for a Sitemap at this location", and the remaining one is "Invalid XML: too many tags: Too many tags describing this tag. Parent tag: channel link".

It says I should fix the problem/s and resubmit, but honestly I don't know where to start with this problem.
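One possible cause of "URL not allowed" is a host mismatch: the sitemap protocol expects every listed URL to be on the same host as the sitemap itself, and here the sitemap was submitted under www.owlbox.blogspot.com while the robots.txt above refers to owlbox.blogspot.com. That this is the cause here is an assumption, not something the report confirms. A minimal sketch of the check, with a hypothetical post URL:

```python
from urllib.parse import urlsplit


def urls_outside_sitemap_host(sitemap_url, entry_urls):
    """Return the entry URLs whose host differs from the sitemap's host."""
    sitemap_host = urlsplit(sitemap_url).netloc.lower()
    return [
        url for url in entry_urls
        if urlsplit(url).netloc.lower() != sitemap_host
    ]


if __name__ == "__main__":
    sitemap = "http://www.owlbox.blogspot.com/atom.xml"
    # Hypothetical feed entry, for illustration only.
    entries = ["http://owlbox.blogspot.com/2009/03/example-post.html"]
    for url in urls_outside_sitemap_host(sitemap, entries):
        print("host differs from sitemap:", url)
```

If the entries in the feed use the non-www host, submitting the sitemap under that same host should avoid this particular mismatch.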

John said...

Since Blogger is a Google product, it would seem that the contents of robots.txt would have been set by Google. So it ought to be set correctly. I'm not sure about the tag errors.

firephish said...

ah, i think you should be submitting the following as your sitemap:

http://owlbox.blogspot.com/feeds/posts/default?orderby=updated

Owlman said...

Hi John, Firephish is correct that the errors are related to the labels, and they shouldn’t cause any concern.
More thanks Firephish. I tried the URL you provided but for some reason it threw an error. I then tried www.owlbox.blogspot/feeds/posts/default?redirect=false and that worked. Based on my research, redirect=false is necessary if you have a Feedburner account linked. I also submitted my Atom link with the same parameter added, i.e. www.owlbox.blogspot/atom.xml?redirect=false, and that was successful. No errors are shown on the two sitemaps, although I’m a little unclear why only 26 URLs have been picked up on each sitemap when I have a LOT of posts on my blog. I’m assuming that it is just because it wasn’t a full Google scan.
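A likely explanation for the low count is that a Blogger feed returns only a limited number of the most recent posts per request (25 by default, as far as I know), so a feed used as a sitemap lists only that page of posts. Blogger feeds accept `start-index` and `max-results` query parameters, so several feed URLs can be submitted to cover the whole blog. The sketch below builds such URLs; the page size of 100 and the post count of 250 are assumptions chosen for illustration, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def feed_page_urls(base, total_posts, page_size=100):
    """Build one feed URL per page of `page_size` posts (1-based start-index)."""
    urls = []
    for start in range(1, total_posts + 1, page_size):
        query = urlencode({
            "redirect": "false",      # bypass the Feedburner redirect
            "start-index": start,     # first post on this page
            "max-results": page_size, # posts per page (assumed value)
        })
        urls.append(f"{base}?{query}")
    return urls


if __name__ == "__main__":
    base = "http://owlbox.blogspot.com/feeds/posts/default"
    for url in feed_page_urls(base, total_posts=250):
        print(url)
```

Each printed URL could then be submitted as a separate sitemap.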

As you can tell, I’m digging into search engine optimization at the moment, and I just figured out how to add alt tags to my images – something that I’ll be using henceforth. Thanks for popping in and offering knowledgeable advice on this topic. Now I just need to figure out whether it is possible to do a 301 redirect from owlbox.blogspot.com to http://www.owlbox.blogspot.com using Blogger.
