06.10.2012
by Esa Turtiainen

Today’s lesson: Google wipes its ass with your Blogger posts if you don’t know what you are doing.

It seems that the current Google search algorithm is very aggressive about dropping pages that contain certain errors, and those errors are very difficult to avoid in Blogger.

One way to find them is Google Webmaster Tools, though Bing Webmaster Tools actually has a better interface. Before you are allowed to use these tools, you have to verify that you own the site by adding something to the page.
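For example, Google accepts a verification meta tag in the page head, roughly like this (the token is just a placeholder; the tools give you your own, and Bing has an equivalent tag named msvalidate.01):

<meta content='your-verification-token' name='google-site-verification'/>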

First you’ll find out that to really get your pages indexed you must submit a sitemap, which is simply a list of your pages in a certain XML format. You’ll easily find that you can use the following URL as the sitemap of your blog:

http://yourblog.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500

(If you have a very large blog, you must submit the sitemap in chunks of 500 entries.)
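For example, the second chunk of a large blog would be submitted with the same URL pattern, just starting from entry 501:

http://yourblog.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500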

Now you have your pages submitted for indexing. But your page rank sucks.

Enemy #1 seems to be duplicate content: you just submitted several copies of the same posts for indexing. This really drops your quality score through the floor.

You have to remove /search/, /feeds/, /view/ and *_archive.html from the results. Putting the following into the /robots.txt file seems to do the trick:

User-agent: *
Disallow: /search/
Disallow: /feeds/
Disallow: /view/
Disallow: *_archive.html
Sitemap: http://my.blog/atom.xml?redirect=false&start-index=1&max-results=500
Sitemap: http://my.blog/atom.xml?redirect=false&start-index=501&max-results=500

You don’t want the same articles to be found through the search URLs or the feed URLs. Also, the different views seem to contain errors that search engines do not like.

The archive pages ending in _archive.html are also something you don’t want indexed. Strictly speaking, robots.txt is not supposed to understand the *-notation, but it seems to work, at least with Google.
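If you want to stay closer to the form Google documents for its wildcard support, you can write the rule with a leading slash (an alternative form, not what I have in my own file):

Disallow: /*_archive.html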

Automatic updating of the sitemap can be done by including the Sitemap: lines in the robots.txt file, as above. (I actually still have doubts about whether this works.)

If the problem pages are already in the index, you should try to remove them, or wait a couple of months.

Still, this is not enough search engine optimization for most Blogger blogs. At least I had plenty of problems that kept the page rank very low.

One problem seemed to be duplicate descriptions. In HTML it is a good idea to have a line like:

<meta content='Best blog, ever' name='description'/>

to tell the search engines what your blog is about. It is a sin not to include it, but it is a mortal sin to have two of them.

One of my blogs had two of them. At some step of renewing the template (which is a huge mess) the meta tag had ended up in two places.

I found several hints suggesting that the meta in the template be wrapped in an IF clause (see the sketch below). This did not work for me. The right solution was to remove the whole meta tag: it seems to be generated somewhere in the depths of the template code, so it must not also be included where it logically belongs!
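For reference, the commonly suggested IF clause looks roughly like this in the Blogger template (a sketch of the hint I found, not something I can vouch for, since it did not work for me):

<b:if cond='data:blog.metaDescription != ""'>
  <meta expr:content='data:blog.metaDescription' name='description'/>
</b:if>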

And the final hint: add alt texts to your pictures. That also raises your page rank.
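For example (the file name and the text here are made up):

<img alt='Sunset over the harbour' src='sunset.jpg'/>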