XML Web Services And Character Sets

When developing web services that serve XML, it’s important to get the MIME type right, as the wrong one could cause decoding errors due to default character sets.

text/xml defaults to sending your XML document in the us-ascii character set. See section 3.1 of RFC 3023 for the reason why.

application/xml defaults to sending your XML document in the utf-8 character set. See section 8.10 of RFC 3023 and Appendix F of the XML specification for the reasons why.

Of course, you can override these defaults, but it’s worth remembering if your XML application doesn’t behave quite as you were expecting and is rendering odd characters.
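
To make the defaults concrete, here’s a rough Python sketch of how a receiver might pick a character set from a Content-Type header. The function name effective_charset is my own invention for illustration, and the utf-8 answer for application/xml assumes the document itself carries no byte order mark or encoding declaration.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset a receiver would assume for an XML response.

    Hypothetical helper: an explicit charset parameter always wins,
    otherwise the media type decides the default.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_param("charset")
    if explicit:
        return explicit.lower()
    media_type = msg.get_content_type()
    if media_type == "text/xml":
        return "us-ascii"
    if media_type == "application/xml":
        # assumes no BOM or encoding declaration in the document itself
        return "utf-8"
    return None


print(effective_charset("text/xml"))
print(effective_charset("application/xml"))
print(effective_charset("text/xml; charset=UTF-8"))
```

The third call shows the override: once a charset parameter is present, the media type’s default no longer matters.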

Python, XML and iTunes

As I’ve taken the week off work, I thought that, as well as spending time with my family, I’d brush up my Python skills, as they’ve been a bit neglected of late.

I’ve never tried XML parsing with Python, so I thought I’d cover that. Apple’s iTunes has the ability to export information about your music in XML, and I’d been meaning to take a look at that for a while. Why not combine the two? Here’s my take on parsing iTunes export information with Python.

I thought I’d work on a small subset of my library: the tracks I’ve actually paid to download from iTunes, rather than the ones converted from CD.

Rob's purchased iTunes tracks

The exported XML data is a bit peculiar. I would have assumed it to be values enclosed by sensible tag names, e.g. <artist>Human League</artist>. However, it’s actually a bunch of neighbouring tags and values, like this: <key>Artist</key><string>Sheb Wooley</string>.

Here’s a snippet from the actual data export I ran…

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Major Version</key><integer>1</integer>
<key>Minor Version</key><integer>1</integer>
<key>Application Version</key><string>6.0.5</string>
<key>Features</key><integer>1</integer>
<key>Music Folder</key><string>file://localhost/D:/Documents%20and%20Settings/Windows%20User/My%20Documents/My%20Music/iTunes/iTunes%20Music/</string>
<key>Library Persistent ID</key><string>C5DD29C89369B278</string>
<key>Tracks</key>
<dict>
<key>312</key>
<dict>
<key>Track ID</key><integer>312</integer>
<key>Name</key><string>The Purple People Eater</string>
<key>Artist</key><string>Sheb Wooley</string>
<key>Album</key><string>20th Century Rocks: 50's Rock 'n Roll - At the Hop</string>
<key>Genre</key><string>Pop</string>
<key>Kind</key><string>Protected AAC audio file</string>
<key>Size</key><integer>2260837</integer>
<key>Total Time</key><integer>135533</integer>
<key>Disc Number</key><integer>1</integer>
<key>Disc Count</key><integer>1</integer>
<key>Track Number</key><integer>5</integer>
<key>Year</key><integer>2001</integer>
<key>Date Modified</key><date>2006-09-28T09:54:23Z</date>
<key>Date Added</key><date>2006-09-28T09:54:10Z</date>
<key>Bit Rate</key><integer>128</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Play Count</key><integer>11</integer>
<key>Play Date</key><integer>-1042489964</integer>
<key>Play Date UTC</key><date>2007-01-24T08:55:32Z</date>
<key>Normalization</key><integer>7764</integer>
<key>Compilation</key><true/>
<key>Artwork Count</key><integer>1</integer>
<key>Persistent ID</key><string>302B45E87F01479F</string>
<key>Track Type</key><string>File</string>
<key>Protected</key><true/>
<key>Location</key><string>file://localhost/D:/Documents%20and%20Settings/Windows%20User/My%20Documents/My%20Music/iTunes/iTunes%20Music/Compilations/20th%20Century%20Rocks_%2050's%20Rock%20'n%20Roll%20-/05%20The%20Purple%20People%20Eater.m4p</string>
<key>File Folder Count</key><integer>4</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>
<key>313</key>
<dict>
<key>Track ID</key><integer>313</integer>
<key>Name</key><string>Daisy Daisy</string>
<key>Artist</key><string>Johnny O'Tolle &#38; His Naughty Band</string>
<key>Album</key><string>Gay 90's</string>
<key>Genre</key><string>Vocal</string>
<key>Kind</key><string>Protected AAC audio file</string>
<key>Size</key><integer>2346412</integer>
<key>Total Time</key><integer>125084</integer>
<key>Disc Number</key><integer>1</integer>
<key>Disc Count</key><integer>1</integer>
<key>Track Number</key><integer>2</integer>
<key>Track Count</key><integer>10</integer>
<key>Year</key><integer>2006</integer>
<key>Date Modified</key><date>2006-09-28T09:59:52Z</date>
<key>Date Added</key><date>2006-09-28T09:59:38Z</date>
<key>Bit Rate</key><integer>128</integer>
<key>Sample Rate</key><integer>44100</integer>
<key>Play Count</key><integer>6</integer>
<key>Play Date</key><integer>-1038647848</integer>
<key>Play Date UTC</key><date>2007-03-09T20:10:48Z</date>
<key>Artwork Count</key><integer>1</integer>
<key>Persistent ID</key><string>302B45E87F01490F</string>
<key>Track Type</key><string>File</string>
<key>Protected</key><true/>
<key>Location</key><string>file://localhost/D:/Documents%20and%20Settings/Windows%20User/My%20Documents/My%20Music/iTunes/iTunes%20Music/Johnny%20O'Tolle%20&#38;%20His%20Naughty%20Band/Gay%2090's/02%20Daisy%20Daisy.m4p</string>
<key>File Folder Count</key><integer>4</integer>
<key>Library Folder Count</key><integer>1</integer>
</dict>

This makes parsing the data a bit trickier than I had hoped for. I was hoping to use a nice simple XPath expression, but data like this looks like it’s more a job for a SAX based approach.
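
As an aside, the key/value layout is Apple’s property list format, and Python’s standard library has a plistlib module that reads it directly. Here’s a minimal sketch of that alternative, assuming Python 3 and a cut-down sample I’ve trimmed from the export above (no DOCTYPE, only three keys per track). The list_tracks name is made up for the example.

```python
import plistlib

# Cut-down sample in the same shape as the iTunes export.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
<key>Tracks</key>
<dict>
<key>312</key>
<dict>
<key>Track ID</key><integer>312</integer>
<key>Name</key><string>The Purple People Eater</string>
<key>Artist</key><string>Sheb Wooley</string>
</dict>
<key>313</key>
<dict>
<key>Track ID</key><integer>313</integer>
<key>Name</key><string>Daisy Daisy</string>
<key>Artist</key><string>Johnny O'Tolle &#38; His Naughty Band</string>
</dict>
</dict>
</dict>
</plist>
"""


def list_tracks(plist_bytes):
    """Return (name, artist) pairs for every entry under the Tracks key."""
    library = plistlib.loads(plist_bytes)
    return [(t["Name"], t["Artist"]) for t in library["Tracks"].values()]


for name, artist in list_tracks(SAMPLE):
    print("Track = %s\nArtist = %s" % (name, artist))
```

Each <dict> becomes an ordinary Python dictionary, so there’s no need to track which key came last. The SAX route is still a good exercise though, so I’ll carry on with that.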

I took a look in O’Reilly’s excellent Programming Python, and found a nice SAX parser example to modify.

As it’s just a quick test, I’m making a few assumptions about the XML data that a production system would have to handle properly. In this case, I’m assuming a tag order of Track ID, Name and Artist. Using this order, each time we see one of those tags come past, we can make up a Track object and store the relevant data. When we see Track ID, we need a new Track object to store the data in. When we see Name, we store the track name in the object, and when we see Artist, we save the artist, push the Track object onto our list of Tracks and clear the current Track object.

That’s a bit long winded, so here’s the code.

import xml.sax.handler


class ITunesHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.parsing_tag = False
        self.tag = ''
        self.value = ''
        self.tracks = []
        self.track = None

    def startElement(self, name, attributes):
        if name == 'key':
            self.parsing_tag = True
            # start a fresh key name, as characters() may
            # deliver it in more than one piece.
            self.tag = ''

    def characters(self, data):
        if self.parsing_tag:
            self.tag = self.tag + data
            self.value = ''
        else:
            # could be multiple lines, so append data.
            self.value = self.value + data

    def endElement(self, name):
        if name == 'key':
            self.parsing_tag = False
        else:
            if self.tag == 'Track ID':
                # start of a new track, so a new object
                # is needed.
                self.track = Track()
            elif self.tag == 'Name' and self.track:
                self.track.track = self.value
            elif self.tag == 'Artist' and self.track:
                self.track.artist = self.value
                # assume this is all the data we need
                # so append the track object to our list
                # and reset our track object to None.
                self.tracks.append(self.track)
                self.track = None


class Track:
    def __init__(self):
        self.track = ''
        self.artist = ''

    def __str__(self):
        return "Track = %s\nArtist = %s" % (self.track, self.artist)

In the real world, the Track class would offer a lot more functionality. In this case, it’s just for holding data and providing a pretty printer.

Now we need to parse the XML and display the results. Here’s the code…

parser = xml.sax.make_parser()
handler = ITunesHandler()
parser.setContentHandler(handler)
# raw string, so the backslashes in the Windows path are taken literally
parser.parse(r'D:\Documents and Settings\Windows User\Desktop\Purchased.xml')
for track in handler.tracks:
    print(track)

Let’s run that code and see what we get…

Track = The Purple People Eater
Artist = Sheb Wooley
Track = Daisy Daisy
Artist = Johnny O'Tolle & His Naughty Band
Track = Don't Dilly Dally
Artist = Kidzone
Track = Jump In My Car
Artist = David Hasselhoff
Track = Puff, the Magic Dragon
Artist = Peter, Paul And Mary
Track = You Give Love a Bad Name
Artist = Bon Jovi
Track = Heart of Glass
Artist = Blondie
Track = Grace Kelly
Artist = Mika
Track = Standing In the Way of Control
Artist = Gossip
Track = Physical
Artist = Olivia Newton-John
Track = Don't You Want Me
Artist = The Human League
Track = Have a Drink On Me
Artist = Lonnie Donegan
Track = My Old Man's a Dustman
Artist = Lonnie Donegan

That’s great! OK, I’m not going to win any awards for my taste in music, but at least I can now think about building music services that use this data.
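
The handler above leans on the Track ID, Name, Artist ordering. As a rough idea of how that assumption could be dropped, here’s a sketch that collects every key/value pair of a track into a dictionary by watching how deeply the <dict> elements are nested. It’s a different technique from the one I used, the TrackDictHandler and parse_tracks names are invented for the example, and it assumes Python 3 and an export containing only a Tracks section with track dictionaries at the third level of nesting.

```python
import xml.sax
import xml.sax.handler

# Sample with Artist before Name, to show order no longer matters.
SAMPLE = b"""<plist version="1.0">
<dict>
<key>Tracks</key>
<dict>
<key>312</key>
<dict>
<key>Track ID</key><integer>312</integer>
<key>Artist</key><string>Sheb Wooley</string>
<key>Name</key><string>The Purple People Eater</string>
</dict>
<key>313</key>
<dict>
<key>Track ID</key><integer>313</integer>
<key>Name</key><string>Daisy Daisy</string>
<key>Artist</key><string>Johnny O'Tolle &#38; His Naughty Band</string>
</dict>
</dict>
</dict>
</plist>
"""


class TrackDictHandler(xml.sax.handler.ContentHandler):
    TRACK_DEPTH = 3  # plist dict -> Tracks dict -> one dict per track

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.buffer = []
        self.key = None
        self.current = None
        self.tracks = []

    def startElement(self, name, attrs):
        self.buffer = []  # forget whitespace seen between elements
        if name == "dict":
            self.depth += 1
            if self.depth == self.TRACK_DEPTH:
                self.current = {}

    def characters(self, data):
        # text can arrive in several chunks, so collect them all
        self.buffer.append(data)

    def endElement(self, name):
        text = "".join(self.buffer)
        self.buffer = []
        if name == "dict":
            if self.depth == self.TRACK_DEPTH:
                self.tracks.append(self.current)
                self.current = None
            self.depth -= 1
        elif name == "key":
            self.key = text
        elif self.current is not None and self.depth == self.TRACK_DEPTH:
            # any value element inside a track: store it under the last key
            self.current[self.key] = text


def parse_tracks(xml_bytes):
    handler = TrackDictHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.tracks


for track in parse_tracks(SAMPLE):
    print("Track = %s\nArtist = %s" % (track["Name"], track["Artist"]))
```

All values come back as strings here, so a real service would still want to convert the integers and dates.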

Using Twitter From Perl

The world and his dog is currently looking at Twitter and eyeing up the possibilities it offers.

I thought I’d jump on the bandwagon, and have a look at the Twitter API.

I wanted to post to a timeline, so the solution is to use one of the update methods. I chose the XML one, though there is a JSON one also available.

To post to the timeline, Twitter expects an HTTP POST request with a status parameter containing the message you want to post. It associates this with your account by using HTTP’s Basic authentication.

It’s simple to throw together a spot of Perl code to post messages to Twitter knowing this. Have a look at this example…


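use LWP::UserAgent;
use URI::Escape qw(uri_escape);

my $ua = LWP::UserAgent->new;
my $message = "A test post from Perl";
my $req = HTTP::Request->new(POST => 'http://twitter.com/statuses/update.xml');
$req->content_type('application/x-www-form-urlencoded');
# URL-encode the message so spaces and punctuation survive the form encoding
$req->content('status=' . uri_escape($message));
$req->authorization_basic($username, $password);
my $resp = $ua->request($req);
my $content = $resp->content;
print $content;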

You need to set $username and $password to your username and password, and $message to whatever message you want to appear on your timeline (in this case, “A test post from Perl”).
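
For comparison, the same request can be sketched in Python using only the standard library. This version just builds the request so you can see what would go over the wire; it doesn’t send anything. The build_status_request name and the example credentials are made up, and the URL is simply the one used in the Perl above, so treat it as an assumption rather than a guarantee the endpoint is still there.

```python
import base64
import urllib.parse
import urllib.request


def build_status_request(username, password, message,
                         url="http://twitter.com/statuses/update.xml"):
    """Build (but do not send) a form-encoded POST with Basic authentication."""
    # urlencode takes care of escaping spaces and punctuation in the message
    body = urllib.parse.urlencode({"status": message}).encode("ascii")
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    request.add_header("Authorization", "Basic " + token)
    return request


req = build_status_request("user", "pass", "A test post from Perl")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
print(req.get_header("Authorization"))
```

Passing the finished request to urllib.request.urlopen would actually send it.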

Using SOAP::Lite With Perl

I’ve been trying to access a SOAP service using Perl for a project, so I took a quick look at CPAN to see what was available. The answer seemed to be to use SOAP::Lite.

This module turned out to be a real devil to use, and my problems with it were probably compounded by my lack of SOAP experience.

SOAP (in case you didn’t know) is a method of accessing remote services using XML. In my case, I was trying to access content supplied by a third party ringtone provider using SOAP over HTTP.

Using SOAP::Lite you have to specify a uri and a proxy. What stumped me for a while is that the proxy is the URI of the service you wish to access, and the uri is the namespace of the service.

So if the provider I was using had their SOAP service available at ws.robstones-services.co.uk/external.asmx, I would use this as the proxy, and in this case I would use ws.robstones-services.co.uk/External as the uri namespace.

Great, I knew the service I needed was called getCallList, and required a Username and Password to be passed to it. In return it would give me a list of valid content types. A SOAPAction header had to be added to the call to the proxy, telling the remote SOAP service that I wanted to use the service http://ws.robstones-services.co.uk/External/getCallList.

My first attempt was the following code…

#!/usr/bin/perl -w
use strict;
use SOAP::Lite 'trace', 'debug';

my $server = SOAP::Lite
    ->uri('http://ws.robstones-services.co.uk/External')
    ->proxy('http://ws.robstones-services.co.uk/external.asmx');
my $returned = $server
    ->getCallList({
        'Username' => 'RobsUser',
        'Password' => 'RobsPassword'
    });
foreach my $type ($returned->valueof('//getCallListResult/string')) {
    next unless ($type); ## ignore any undefs
    print "$type\n";
}

You’ll notice I’m using trace and debug to see the SOAP messages being sent and received. The dialogue from this script was…

> perl -w test.pl
SOAP::Transport::HTTP::Client::send_receive: POST http://ws.robstones-services.co.uk/external.asmx HTTP/1.1
Accept: text/xml
Accept: multipart/*
Content-Length: 615
Content-Type: text/xml; charset=utf-8
SOAPAction: "http://ws.robstones-services.co.uk/External#getCallList"
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/1999/XMLSchema" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body><namesp1:getCallList xmlns:namesp1="http://ws.robstones-services.co.uk/External"><c-gensym3><Username xsi:type="xsd:string">RobsUser</Username><Password xsi:type="xsd:string">RobsPassword</Password></c-gensym3></namesp1:getCallList></SOAP-ENV:Body></SOAP-ENV:Envelope>
SOAP::Transport::HTTP::Client::send_receive: HTTP/1.1 500 (Internal Server Error) Internal Server Error.
Cache-Control: private
Connection: close
Date: Tue, 29 Nov 2005 15:23:38 GMT
Server: Microsoft-IIS/6.0
Content-Length: 508
Content-Type: text/xml; charset=utf-8
Client-Date: Tue, 29 Nov 2005 15:23:09 GMT
Client-Response-Num: 1
X-AspNet-Version: 1.1.4322
X-Powered-By: ASP.NET
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<soap:Fault>
<faultcode>soap:Client</faultcode>
<faultstring>Server did not recognize the value of HTTP Header SOAPAction: http://ws.robstones-services.co.uk/External#getCallList.</faultstring>
<detail />
</soap:Fault>
</soap:Body>
</soap:Envelope>

The call didn’t work. You’ll see the SOAPAction header is wrong: it’s http://ws.robstones-services.co.uk/External#getCallList instead of http://ws.robstones-services.co.uk/External/getCallList.

I needed to get SOAP::Lite to use the correct URL. The secret turned out to be to change the use SOAP::Lite line to the following.

use SOAP::Lite on_action => sub {sprintf '%s/%s', @_},
'trace', 'debug';

on_action is a callback SOAP::Lite uses to build the SOAPAction header from the URI and the method name. By default it joins the two with a #; here we’re telling it to join them with a / instead.
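
If it helps to see the difference spelled out, here’s a tiny Python sketch of the two ways of joining the namespace and the method name. The soap_action function is purely illustrative and isn’t part of any library.

```python
SERVICE_URI = "http://ws.robstones-services.co.uk/External"


def soap_action(uri, method, separator="#"):
    """Join a namespace URI and a method name into a SOAPAction value."""
    return "%s%s%s" % (uri, separator, method)


# what was being sent before the change
print(soap_action(SERVICE_URI, "getCallList"))
# what the .NET service wanted to see
print(soap_action(SERVICE_URI, "getCallList", separator="/"))
```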

This gave me the following SOAP dialogue…

> perl -w test.pl
SOAP::Transport::HTTP::Client::send_receive: POST http://ws.robstones-services.co.uk/external.asmx HTTP/1.1
Accept: text/xml
Accept: multipart/*
Content-Length: 615
Content-Type: text/xml; charset=utf-8
SOAPAction: http://ws.robstones-services.co.uk/External/getCallList
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/1999/XMLSchema" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body><namesp1:getCallList xmlns:namesp1="http://ws.robstones-services.co.uk/External"><c-gensym3><Username xsi:type="xsd:string">RobsUser</Username><Password xsi:type="xsd:string">RobsPassword</Password></c-gensym3></namesp1:getCallList></SOAP-ENV:Body></SOAP-ENV:Envelope>
SOAP::Transport::HTTP::Client::send_receive: HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Tue, 29 Nov 2005 15:22:53 GMT
Server: Microsoft-IIS/6.0
Content-Length: 405
Content-Type: text/xml; charset=utf-8
Client-Date: Tue, 29 Nov 2005 15:22:25 GMT
Client-Response-Num: 1
X-AspNet-Version: 1.1.4322
X-Powered-By: ASP.NET
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><getCallListResponse xmlns="http://ws.robstones-services.co.uk/External"><getCallListResult><string xsi:nil="true" /></getCallListResult></getCallListResponse></soap:Body></soap:Envelope>

Bugger, it still didn’t work. I looked again at the SOAP message being sent. It was sending a lot of namespace information and a wrapper around the Username and Password. The wrapper was definitely wrong, and the remote .NET service didn’t seem to like the namespaces.

I needed to sort that out, so I had to tell SOAP::Lite not to include namespaces or wrap up the Username and Password. The way to do this is to use SOAP::Data to hardcode the data being sent.

This meant changing the call to getCallList to the following…

my $returned = $server
->getCallList(
SOAP::Data->name('Username')->value('RobsUser')->type(''),
SOAP::Data->name('Password')->value('RobsPassword')->type('')
);

The SOAP dialogue after these changes looked like this…

SOAP::Transport::HTTP::Client::send_receive: POST http://ws.robstones-services.co.uk/external.asmx HTTP/1.1
Accept: text/xml
Accept: multipart/*
Content-Length: 548
Content-Type: text/xml; charset=utf-8
SOAPAction: http://ws.robstones-services.co.uk/External/getCallList
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/1999/XMLSchema" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body><namesp1:getCallList xmlns:namesp1="http://ws.robstones-services.co.uk/External"><Username>RobsUser</Username><Password>RobsPassword</Password></namesp1:getCallList></SOAP-ENV:Body></SOAP-ENV:Envelope>
SOAP::Transport::HTTP::Client::send_receive: HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Tue, 29 Nov 2005 15:19:49 GMT
Server: Microsoft-IIS/6.0
Content-Length: 405
Content-Type: text/xml; charset=utf-8
Client-Date: Tue, 29 Nov 2005 15:19:21 GMT
Client-Response-Num: 1
X-AspNet-Version: 1.1.4322
X-Powered-By: ASP.NET
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><getCallListResponse xmlns="http://ws.robstones-services.co.uk/External"><getCallListResult><string xsi:nil="true" /></getCallListResult></getCallListResponse></soap:Body></soap:Envelope>

Close, but still not right. The problem was with the getCallList element itself, which was still using a namespace prefix.

The final solution to this problem was not to call getCallList directly, but instead to use SOAP::Lite’s call method to explicitly set how the function was being called.

The code after making this change was the following…

my $returned = $server
->call(SOAP::Data->name('getCallList')->attr({xmlns => 'http://ws.robstones-services.co.uk/External'}) =>
SOAP::Data->name('Username')->value('RobsUser')->type(''),
SOAP::Data->name('Password')->value('RobsPassword')->type('')
);

This gave me the SOAP dialogue…

> perl -w test_soap.pl
SOAP::Transport::HTTP::Client::send_receive: POST http://ws.robstones-services.co.uk/external.asmx HTTP/1.1
Accept: text/xml
Accept: multipart/*
Content-Length: 524
Content-Type: text/xml; charset=utf-8
SOAPAction: http://ws.robstones-services.co.uk/External/getCallList
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/1999/XMLSchema" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body><getCallList xmlns="http://ws.robstones-services.co.uk/External"><Username>RobsUser</Username><Password>RobsPassword</Password></getCallList></SOAP-ENV:Body></SOAP-ENV:Envelope>
SOAP::Transport::HTTP::Client::send_receive: HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Tue, 29 Nov 2005 15:16:07 GMT
Server: Microsoft-IIS/6.0
Content-Length: 540
Content-Type: text/xml; charset=utf-8
Client-Date: Tue, 29 Nov 2005 15:15:39 GMT
Client-Response-Num: 1
X-AspNet-Version: 1.1.4322
X-Powered-By: ASP.NET
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><getCallListResponse xmlns="http://ws.robstones-services.co.uk/External"><getCallListResult><string>ROBS_JAVA</string><string>ROBS_WALL</string><string>ROBS_POLY</string><string>ROBS_MONO</string><string>ROBS_REAL</string><string xsi:nil="true" /></getCallListResult></getCallListResponse></soap:Body></soap:Envelope>
ROBS_JAVA
ROBS_WALL
ROBS_POLY
ROBS_MONO
ROBS_REAL

Hurrah, after all that effort, I have a list of 5 types of content offered by my third party supplier. I can now go ahead and build my service.

SOAP::Lite is a powerful module, but the lack of simple, easy to follow documentation and examples holds it back. I hope this small article helps out other programmers just starting on the SOAP path.
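
For anyone handling the same sort of response outside Perl, here’s a rough Python sketch of the last step: pulling the content types out of the reply, skipping the empty nil entry just as the Perl loop does. The response is copied from the final dialogue above, the content_types name is my own, and the sketch only uses xml.etree.ElementTree from the standard library.

```python
import xml.etree.ElementTree as ET

# The successful response from the dialogue above, split over several lines.
RESPONSE = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" '
    b'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    b'xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body>'
    b'<getCallListResponse xmlns="http://ws.robstones-services.co.uk/External">'
    b'<getCallListResult><string>ROBS_JAVA</string><string>ROBS_WALL</string>'
    b'<string>ROBS_POLY</string><string>ROBS_MONO</string>'
    b'<string>ROBS_REAL</string><string xsi:nil="true" /></getCallListResult>'
    b'</getCallListResponse></soap:Body></soap:Envelope>'
)

# Prefixes used in the search path below; "svc" is an arbitrary label.
NAMESPACES = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "svc": "http://ws.robstones-services.co.uk/External",
}


def content_types(response_bytes):
    """Return the non-empty <string> values inside getCallListResult."""
    root = ET.fromstring(response_bytes)
    path = "soap:Body/svc:getCallListResponse/svc:getCallListResult/svc:string"
    # the xsi:nil entry has no text, so it is filtered out here
    return [node.text for node in root.findall(path, NAMESPACES) if node.text]


for content_type in content_types(RESPONSE):
    print(content_type)
```

Note that every element in the reply lives in the service’s namespace, which is why each step of the path needs the svc prefix.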

Using del.icio.us Bookmarks With Perl

Earlier this week I put live some Perl code that took bookmarks I was posting to del.icio.us and added them automatically to my blog.

I’ve had a few people ask how I was able to do this, and it’s no big secret.

del.icio.us expose an API that anyone can use to interact with their service. I’m using this with some Perl glue to aggregate the previous day’s posts and put them up on my own site.

The function I’m using is posts/get, which can take a date as an optional parameter. I use this parameter to get all postings from yesterday. So if the date today is 25th November 2005, to get yesterday’s posts from del.icio.us I call the URL http://del.icio.us/api/posts/get?dt=2005-11-24.
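
Working out yesterday’s date is the only fiddly part of building that URL. As a quick illustration, here’s how it might look in Python; the posts_url_for_yesterday name is invented for the example.

```python
from datetime import date, timedelta


def posts_url_for_yesterday(today):
    """Build the posts/get URL for the day before the given date."""
    yesterday = today - timedelta(days=1)
    # isoformat() gives the YYYY-MM-DD form the dt parameter expects
    return "http://del.icio.us/api/posts/get?dt=" + yesterday.isoformat()


print(posts_url_for_yesterday(date(2005, 11, 25)))
# prints http://del.icio.us/api/posts/get?dt=2005-11-24
```

In a real script you would pass date.today() rather than a fixed date.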

All API calls to del.icio.us use HTTP authentication, with your site login and password as the credentials.

To access this data using Perl, we can use the LWP modules from CPAN.

use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->agent('robslinkbot/0.1 (+http://www.robertprice.co.uk)');
my $get_url = 'http://del.icio.us/api/posts/get?dt=' . $yesterday;
my $req = HTTP::Request->new(GET => $get_url);
$req->authorization_basic(USERNAME, PASSWORD);
my $res = $ua->request($req);
## declared outside the if block so it is still in scope when we parse it
my $xml;
if ($res->is_success) {
    $xml = $res->content;
} else {
    warn "unable to get content from del.icio.us\n";
}

In this case USERNAME and PASSWORD are two constants holding my username and password. You’ll need to put your own details in here.

You’ll also notice that I’m setting the agent to be robslinkbot/0.1 (+http://www.robertprice.co.uk). del.icio.us specifically request that a user-agent is set, as they tend to ban generic user-agents from time to time. If I didn’t set this, it would default to libwww-perl followed by a version number.

So now we have this code, and if it’s worked correctly, I should have my posts from the previous day in the variable $xml, and if it’s not worked, I should have seen a warning informing me that the script was unable to get content from del.icio.us.

We can parse the XML provided very easily using one of Perl’s many XML modules. In this case, I’m going to use XML::XPath.

As we’ll have multiple bookmarks (hopefully), we’ll create a list of hashrefs. The hashrefs will contain the information relating to the post.

OK, so firstly we create a variable called @posts to store the hashrefs in.

my @posts;

Now we can create our XML::XPath object, and get it to parse the xml we’ve already downloaded from del.icio.us and have in the $xml variable.

my $xp = XML::XPath->new(xml => $xml);

We need to see if the XPath /posts/post exists. If it does, it means we have posts to parse.

if ($xp->exists('/posts/post')) {

Now we have to find all the posts and iterate over them.

foreach my $posts ($xp->find('//post')->get_nodelist) {

Now all we do is to extract the information we need from each post, store it in our hashref and finally store it in the @posts list.

    my $post_hashref;
    ## we use .= to stringify the find value, must be a better way to do that.
    $post_hashref->{'href'} .= $posts->find('@href');
    $post_hashref->{'description'} .= $posts->find('@description');
    $post_hashref->{'time'} .= $posts->find('@time');
    $post_hashref->{'hash'} .= $posts->find('@hash');
    $post_hashref->{'extended'} .= $posts->find('@extended');
    my @tags = split(/ /, $posts->find('@tag'));
    ## store a reference to the list; assigning @tags itself would only store its length.
    $post_hashref->{'tags'} = \@tags;
    push @posts, $post_hashref;
  }
}

You may have noticed that we have split the tags and stored them as a list. I just find this easier to work with.

And that is about it. You should have a list you can iterate over, or pass to something like Template Toolkit for displaying. This is a process I use on robertprice.co.uk.
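
If Perl isn’t your thing, the parsing step translates fairly directly. Here’s a small Python sketch that builds the same list-of-dictionaries structure. The attribute names are the ones used in the Perl above; the sample document, its values and the parse_posts name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape the code above expects.
SAMPLE = """<posts dt="2005-11-24" user="example">
<post href="http://www.example.com/" description="An example link"
      extended="Some longer notes" hash="abc123"
      tag="perl xml" time="2005-11-24T10:00:00Z" />
<post href="http://www.example.org/" description="Another link"
      hash="def456" tag="python" time="2005-11-24T11:30:00Z" />
</posts>"""


def parse_posts(xml_text):
    """Turn each <post> element into a dictionary, splitting tags into a list."""
    root = ET.fromstring(xml_text)
    posts = []
    for node in root.iter("post"):
        posts.append({
            "href": node.get("href", ""),
            "description": node.get("description", ""),
            "time": node.get("time", ""),
            "hash": node.get("hash", ""),
            # extended is optional, so fall back to an empty string
            "extended": node.get("extended", ""),
            "tags": node.get("tag", "").split(),
        })
    return posts


for post in parse_posts(SAMPLE):
    print(post["href"], post["tags"])
```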