The fact that Blogger uses the Google Data API opens up all kinds of possibilities. I was browsing the Blogger Data API documentation and noticed that Zend has created a PHP client for Google Data. I'm pretty comfortable with PHP, and PHP 5 has a good DOM interface, so I thought I'd take a look.
Zend's framework code isn't very far along (they're still not up to their 1.0 release), but it was enough to get me started. I was able to create a script that went through all my Blogger posts in month-by-month chunks (my other blog has over 200 posts, spread out over 2 years) and saved a local copy of the XML feed for each post. I wanted to have a backup of them before I attempted any scripted changes, and it turns out that was a very good thing.
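For anyone curious what "month-by-month chunks" looks like, here is a rough sketch of the idea. It is in Python rather than the PHP I used, and it is not my actual script. It assumes the feed accepts the Google Data `published-min` and `published-max` query parameters; `BLOG_ID` is a placeholder, and the function names are made up for illustration.

```python
from datetime import date
from urllib.parse import urlencode

# Placeholder feed address: replace BLOG_ID with your own blog's ID.
FEED_URL = "http://www.blogger.com/feeds/BLOG_ID/posts/default"


def month_windows(first, last):
    """Yield (start, end) dates for every month from first to last.

    The end date is exclusive: it is the first day of the following month.
    """
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        if month == 12:
            next_year, next_month = year + 1, 1
        else:
            next_year, next_month = year, month + 1
        yield date(year, month, 1), date(next_year, next_month, 1)
        year, month = next_year, next_month


def feed_url_for(start, end):
    """Build a feed URL limited to posts published between start and end."""
    query = urlencode({
        "published-min": start.isoformat() + "T00:00:00",
        "published-max": end.isoformat() + "T00:00:00",
    })
    return FEED_URL + "?" + query


if __name__ == "__main__":
    for start, end in month_windows(date(2006, 11, 15), date(2007, 2, 1)):
        print(feed_url_for(start, end))
```

Each URL can then be fetched and the response written to its own file, which gives you a backup small enough to inspect by hand.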
Next, I modified my script so it could update the posts. My goal: converting my del.icio.us tags to the new Blogger-style labels. There are a lot of posts on my blog, and a lot of tags (it's a book-blog, and since we're tagging authors there are a lot of unique or infrequently used tags), so it would have been major work to retag all of the posts by hand. Here I ran into what looked like a limitation in Zend's implementation: I couldn't find an update function in their Blogger/GData classes. However, it was easy enough to add a quick method that would do a PUT instead of a POST.
The other snag I ran into: I was using the DOM to pull the del.icio.us tags out of my posts (which worked okay) and then remove the obsolete div of del.icio.us tags from the post entirely. What I discovered is that not all of my posts are well-formed XML (the most common culprits seemed to be unclosed <br> tags), and this resulted in some problems. I won't go into the details, but let's just say I was very glad I'd made a local backup of all my posts before I attempted anything.
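One possible way around the unclosed <br> problem is to close the "void" tags before handing the post to a strict XML parser. The sketch below shows that idea in Python rather than PHP, and it is not the script I ran. The `delicious-tags` class name and the function names are assumptions made up for the example, and it will still choke on other kinds of malformed markup (HTML entities like `&nbsp;`, unclosed <p> tags, and so on).

```python
import re
import xml.dom.minidom as minidom

# Tags that HTML allows without a closing tag but XML does not.
VOID_TAGS = ("br", "hr", "img")


def close_void_tags(html):
    """Rewrite <br> as <br/> (and likewise for the other void tags)."""
    pattern = r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(VOID_TAGS)
    return re.sub(pattern, r"<\1\2/>", html, flags=re.IGNORECASE)


def split_tags(post_html, div_class="delicious-tags"):
    """Return (tags, body): the link texts from the tag div, and the
    post content with that div removed."""
    # Wrap the fragment so the parser always sees a single root element.
    doc = minidom.parseString("<root>%s</root>" % close_void_tags(post_html))
    tags = []
    for div in doc.getElementsByTagName("div"):
        if div.getAttribute("class") == div_class:
            tags = [a.firstChild.data
                    for a in div.getElementsByTagName("a")
                    if a.firstChild is not None]
            div.parentNode.removeChild(div)
            break
    body = "".join(node.toxml() for node in doc.documentElement.childNodes)
    return tags, body


if __name__ == "__main__":
    post = ('Great book.<br>Loved it.'
            '<div class="delicious-tags">'
            '<a href="http://del.icio.us/tag/gaiman">gaiman</a> '
            '<a href="http://del.icio.us/tag/fantasy">fantasy</a></div>')
    print(split_tags(post))
```

Whatever approach you take, run it against the local backup first and compare the output with the original before writing anything back to the blog.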
I hope to clean up my script to make it work a little better (so that I can finish re-tagging my posts without destroying any of them!). If there's any interest, I might be willing to post the relevant code or upload a script somewhere so that others could benefit from it.
5 comments:
Hi Larq,
This is a really neat use case and has some great pointers for others using the Blogger GData API.
FYI - The Zend Framework classes for GData do actually support updating entries. After retrieving an individual entry and making the appropriate changes, you just need to call $entry->save().
Take a look at Zend_Feed_EntryAtom::save() for more info.
-Ryan (Google)
Thanks for the tip! I couldn't find as much documentation on the object model as I would have liked (or maybe I didn't dig hard enough?). The examples were pretty sparse. That looks like a sweet way to do it, since each entry has its own save url encoded within it.
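To illustrate what I mean by the save URL: each Atom entry in the feed carries a link with rel="edit", and that is the address an update gets PUT to. Here is a rough sketch in Python (not the Zend code, and not what save() does internally as far as I know) that finds the link and builds the request without sending it. The function names are made up, and a real request would also need authentication.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def edit_url(entry_xml):
    """Return the href of the entry's rel="edit" link, or None."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == "edit":
            return link.get("href")
    return None


def build_update_request(entry_xml):
    """Build (but do not send) a PUT request that would update the entry."""
    url = edit_url(entry_xml)
    if url is None:
        raise ValueError("entry has no edit link")
    return urllib.request.Request(
        url,
        data=entry_xml.encode("utf-8"),
        headers={"Content-Type": "application/atom+xml"},
        method="PUT",
    )
```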
Thanks for the commentary - we are definitely looking at providing more examples from the Google side. Also, I did look through the Zend documentation and didn't find a reference to updating entries. I will mention that to the Zend folks.
Thanks again,
-Ryan
Hey larq, nice tip. I'm also doing some work with the Zend Gdata/Blogger API, and currently I'm trying to set the Atom published date-time information. Do you know if it's possible to set it?
Looking at some of the posts I downloaded from my blog, there is a published tag at the top, right after the id tag. It looks like the Zend_Gdata_Entry object has a mapping for that, so you should be able to set published and updated just like any of the other fields.
FYI, the date/time format looks to be W3C; the example I downloaded from Blogger looks like this: 2007-03-02T05:00:00.000-05:00
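If you want to generate a timestamp in that shape, here is a small sketch in Python (not PHP, and not part of the Zend classes). The function name is made up for the example, and the -05:00 offset is simply copied from the sample above.

```python
from datetime import datetime, timedelta, timezone


def atom_timestamp(moment):
    """Format a timezone-aware datetime with milliseconds and a UTC offset."""
    if moment.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    return moment.isoformat(timespec="milliseconds")


if __name__ == "__main__":
    eastern = timezone(timedelta(hours=-5))
    print(atom_timestamp(datetime(2007, 3, 2, 5, 0, 0, tzinfo=eastern)))
    # 2007-03-02T05:00:00.000-05:00
```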