Earlier this week I put live some Perl code that took bookmarks I was posting to del.icio.us and added them automatically to my blog.
I’ve had a few people ask how I was able to do this, and it’s no big secret.
del.icio.us expose an API that anyone can use to interact with their service. I’m using this with some Perl glue to aggregate the previous day’s posts and put them up on my own site.
The API function I’m using is posts/get, which can take a date as an optional parameter. I use this parameter to get all postings from yesterday. So if the date today is 25th November 2005, to get yesterday’s posts from del.icio.us I call the URL http://del.icio.us/api/posts/get?dt=2005-11-24.
All API calls to del.icio.us use HTTP-Auth; the credentials are your site login and password.
To access this data using Perl, we can use the LWP modules from CPAN.
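The request code in this post relies on a variable holding yesterday’s date, so here is a minimal sketch of one way to build it with the core POSIX module. The helper name yesterday_date is purely illustrative, and working in UTC (gmtime) rather than local time is an assumption on my part.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Return the date one day before the given epoch time, formatted YYYY-MM-DD.
# Defaults to the current time. UTC is assumed here; adjust if you need local time.
sub yesterday_date {
    my ($now) = @_;
    $now = time unless defined $now;
    return strftime('%Y-%m-%d', gmtime($now - 24 * 60 * 60));
}

my $yesterday = yesterday_date();
my $get_url   = 'http://del.icio.us/api/posts/get?dt=' . $yesterday;
print "$get_url\n";
```

Passing an explicit epoch time to yesterday_date makes it easy to check the behaviour for a known date.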
use LWP::UserAgent;
use HTTP::Request;

# $yesterday is expected to hold yesterday's date as YYYY-MM-DD.
my $ua = LWP::UserAgent->new;
$ua->agent('robslinkbot/0.1 (+http://www.robertprice.co.uk)');
my $get_url = 'http://del.icio.us/api/posts/get?dt=' . $yesterday;
my $req = HTTP::Request->new(GET => $get_url);
$req->authorization_basic(USERNAME, PASSWORD);
my $res = $ua->request($req);
my $xml;    # declared outside the if block so it is still in scope afterwards
if ($res->is_success) {
    $xml = $res->content;
} else {
    warn "unable to get content from del.icio.us\n";
}
In this case USERNAME and PASSWORD are two constants holding my username and password. You’ll need to put your own details in here.
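For anyone unsure how those constants might be declared, here is a rough sketch using Perl’s constant pragma. The credential values are obviously placeholders, and the request is only built, not sent, so you can see the header that HTTP-Auth ends up using.

```perl
use strict;
use warnings;
use HTTP::Request;

# Placeholder credentials for illustration only; swap in your own.
use constant USERNAME => 'your-username';
use constant PASSWORD => 'your-password';

my $req = HTTP::Request->new(GET => 'http://del.icio.us/api/posts/get');
$req->authorization_basic(USERNAME, PASSWORD);

# authorization_basic sets a standard "Authorization: Basic ..." header.
print $req->header('Authorization'), "\n";
```

The credentials are only base64 encoded, not encrypted, so treat the script and anything it logs with the same care as the password itself.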
You’ll also notice that I’m setting the agent to be robslinkbot/0.1 (+http://www.robertprice.co.uk). del.icio.us specifically request that a user-agent is set, as they tend to ban generic user-agents from time to time. If I didn’t set this, it would default to LWP’s own identifier, libwww-perl followed by a version number.
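If you want to see the difference for yourself, this small sketch prints the agent string LWP uses by default and then the custom one. No request is made; it only inspects the user agent object.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Called with no arguments, agent() returns the current value.
my $default_agent = $ua->agent;

$ua->agent('robslinkbot/0.1 (+http://www.robertprice.co.uk)');
my $custom_agent = $ua->agent;

print "default: $default_agent\n";
print "custom:  $custom_agent\n";
```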
So now we have this code. If it has worked correctly, I should have my posts from the previous day in the variable $xml; if it hasn’t, I should have seen a warning telling me that the script was unable to get content from del.icio.us.
We can parse the XML provided very easily using one of Perl’s many XML modules. In this case, I’m going to use XML::XPath.
As we’ll have multiple bookmarks (hopefully), we’ll create a list of hashrefs. The hashrefs will contain the information relating to the post.
Ok, so firstly we create a variable called @posts to store the hashrefs in.
my @posts;
Now we can create our XML::XPath object, and get it to parse the XML we’ve already downloaded from del.icio.us and have in the $xml variable.
my $xp = XML::XPath->new(xml => $xml);
We need to see if the XPath /posts/post exists. If it does, it means we have posts to parse.
if ($xp->exists('/posts/post')) {
Now we have to find all the posts and iterate over them.
    foreach my $posts ($xp->find('//post')->get_nodelist) {
Now all we do is to extract the information we need from each post, store it in our hashref and finally store it in the @posts list.
        my $post_hashref = {};
        ## we use .= to stringify the find value, must be a better way to do that.
        $post_hashref->{'href'} .= $posts->find('@href');
        $post_hashref->{'description'} .= $posts->find('@description');
        $post_hashref->{'time'} .= $posts->find('@time');
        $post_hashref->{'hash'} .= $posts->find('@hash');
        $post_hashref->{'extended'} .= $posts->find('@extended');
        my @tags = split(/ /, $posts->find('@tag'));
        ## store a reference to the list; assigning @tags directly would only store its length.
        $post_hashref->{'tags'} = \@tags;
        push @posts, $post_hashref;
    }
}
You may have noticed that we have split the tags and stored them as a list. I just find this easier to work with.
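To tie the parsing steps together, here is a self-contained sketch that runs against some made-up sample XML instead of a live response. The sample data and the parse_posts name are my own inventions for illustration, and I’ve used XML::XPath’s findnodes and findvalue methods here as one possible alternative to the .= trick; treat it as a sketch and not the exact code running on my site.

```perl
use strict;
use warnings;
use XML::XPath;

# Made-up sample data in the shape the parsing code expects; not real API output.
my $xml = <<'XML';
<posts>
  <post href="http://example.com/one" description="First link"
        extended="Some notes" hash="abc123" tag="perl xml"
        time="2005-11-24T10:00:00Z" />
  <post href="http://example.com/two" description="Second link"
        extended="" hash="def456" tag="web"
        time="2005-11-24T11:30:00Z" />
</posts>
XML

# Parse the XML and return a list of hashrefs, one per <post> element.
sub parse_posts {
    my ($xml) = @_;
    my $xp = XML::XPath->new(xml => $xml);
    my @posts;
    foreach my $node ($xp->findnodes('/posts/post')) {
        my %post;
        foreach my $name (qw(href description time hash extended)) {
            # Concatenating with '' forces the result to a plain string.
            $post{$name} = '' . $xp->findvalue('@' . $name, $node);
        }
        # split ' ' splits on any run of whitespace.
        $post{tags} = [ split ' ', '' . $xp->findvalue('@tag', $node) ];
        push @posts, \%post;
    }
    return @posts;
}

my @posts = parse_posts($xml);
printf "%s (%s)\n", $_->{href}, join(', ', @{ $_->{tags} }) for @posts;
```

Keeping the parsing in a function that takes the XML as a string means it can be tried out without touching the network.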
And that is about it. You should now have a list you can iterate over, or pass to something like Template Toolkit for displaying. This is the process I use on robertprice.co.uk.
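As a rough idea of what iterating over the list might look like, here is a sketch that turns it into a simple HTML list using nothing but core Perl. The escape_html and render_posts helpers and the sample post are hypothetical, and it assumes each post’s tags are stored as an array reference.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attributes.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turn a list of post hashrefs into a simple HTML list.
sub render_posts {
    my @posts = @_;
    my $html = "<ul>\n";
    foreach my $post (@posts) {
        $html .= sprintf qq{  <li><a href="%s">%s</a> [%s]</li>\n},
            escape_html($post->{href}),
            escape_html($post->{description}),
            escape_html(join ', ', @{ $post->{tags} });
    }
    return $html . "</ul>\n";
}

# A made-up post for illustration.
my @posts = (
    {
        href        => 'http://example.com/?a=1&b=2',
        description => 'An <example> link',
        tags        => [ 'perl', 'xml' ],
    },
);
print render_posts(@posts);
```

A templating system would normally handle the escaping for you, which is one reason I’d reach for Template Toolkit on anything bigger than this.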