Using Symbian Perl

I’ve been playing with Jarkko Hietaniemi’s port of Perl for Series 60 Symbian devices on my Nokia 7610.

There is a lot of work to be done on it still, but it works, and I can run simple Perl scripts.

Perl running on a Nokia 7610 phone

The screenshot above shows me running a simple script to list the contents of the Lifeblog directory on my Nokia 7610. Here’s the code that was run on the phone to generate the listing.


my $directory = "C:\\system\\Data\\lifeblog\\";
print "$directory\n";
opendir(DIR, $directory) or die "Can't open $directory: $!";
foreach my $file (readdir(DIR)) {
    print "$file\n";
}
closedir(DIR);
sleep(5);

As you can see it’s simple stuff. I added the sleep(5) at the end just to give me time to take a screenshot, because once the script has run it prompts you to press any key to exit. In this case, the screenshot key itself caused an exit. 🙂

It may not look like much, but it’s bloody exciting stuff for me. The possibility of running decent Perl applications on my phone has me salivating in anticipation of the next release. At present the version out there (0.1.0) doesn’t offer any interface into the native Symbian libraries from Perl, but I’m sure that will be changing soon.

It’s easy to write simple scripts: all I had to do for this one was drop it into the Perl directory on my memory card (via Nokia PC Suite) and use the PerlApp application supplied in the Symbian distribution.

The Python guys are lucky to have a fuller-featured release of their favourite scripting language, but watch out, the Perl Mongers are coming!

Lifeblog Proxy Idea

Sitting in a Lifeblog debrief earlier, one thing that struck me was that others had the same problem as me: wanting to post to multiple blogs.

It seems most would like to separate a work blog from a personal blog, but unless both are hosted on the same Typepad account, for example, Lifeblog doesn’t let you do this. From a service point of view it’s a one-to-one match.

Posting on Lifeblog

Sitting there, my mind was mulling the problem over, and it would appear that a simple Lifeblog proxy would solve it. If blogs are hosted on the same service and accessible by the same username and password, Lifeblog lets you post to different blogs. Why not just build a service that can proxy between various Lifeblog-compatible blogs, so you wouldn’t have to host them all together?

Posting on Lifeblog via a proxy

So how might this work from a technical perspective?

Well, Lifeblog posts using a flavour of the Atom protocol. For security it uses WSSE authentication on the posts. This means that the proxy would need its own username and password for Lifeblog to authenticate against. The various blogs it proxies onto would each have their own usernames and passwords, and the proxy would have to insert these as it passes each post on to the relevant blog. We could store all the blogs we’re allowing posts to in an XML config file. For example…

<blogs>
  <blog>
    <name>My Blog</name>
    <url>http://work.blog.com/post.pl</url>
    <username>robertprice</username>
    <password>secret</password>
  </blog>
  <blog>
    <name>My Blog 2</name>
    <url>http://my.website.com/post.pl</url>
    <username>rob</username>
    <password>lifeblog</password>
  </blog>
</blogs>

Here all the blogs are listed, along with their name, posting URL, username and password. The proxy would take this list and return a localised list of blogs that, when posted to, would just pass the relevant data across. This means the proxy breaks down into two areas.

First, the list of blogs. This reads the XML and returns a list of localised blogs and posting URLs that Lifeblog can use to upload content.

Secondly, each localised posting URL needs to strip the Lifeblog WSSE authentication and replace it with the correct username and password for the real blog before passing the post on to the real upload URL.

It could be as simple as that. Maybe I’ll mock something up in Perl to test the theory out.
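To sketch the config-reading half, something like this would do, assuming the blogs.xml file shown above and the XML::Simple module from CPAN (the filename and structure are just my illustration)…

#!/usr/bin/perl -w
use strict;
use XML::Simple;

## load the hypothetical blogs.xml config shown above.
my $config = XMLin('./blogs.xml', ForceArray => ['blog']);

## list each blog along with the real posting URL and username
## the proxy would substitute into forwarded posts.
foreach my $blog (@{$config->{blog}}) {
    print "$blog->{name} => $blog->{url} (posting as $blog->{username})\n";
}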

Anyway, who’s to say this has to proxy just Lifeblog? It could alternatively be a gateway that translates into one of the common blogging APIs, instantly opening up Lifeblog to millions more users. Now that would be cool!

UPDATE 23/04/05

Hugo emailed me to say Lifeblog 1.6 can handle some of what I have suggested…

Actually, Lifeblog 1.6 can post to more than one account, and is
available for the Nokia 6630, 6680, 6681, 6682. Unfortunately Lifeblog
1.5 (for 7610, 6670, 6260, 3230) can only post to one blog. And the PC
can post to multiple accounts.

Nokia Releases Series 60 Patch For Perl

It looks like Perl on Nokia Series 60 phones is getting closer, as Jarkko Hietaniemi has just posted a patch to the Perl 5 Porters mailing list that enables Perl 5.8.x and Perl 5.9.x to work on Symbian smartphones. The message specifically states that it is known to work on Nokia Series 60 phones. The port is copyright Nokia.

I’m now officially very excited! Perl could very soon be running on my Nokia 6630!

A quick delve into the attached README reveals…

The attached patches enable compiling Perl on the Symbian OS platform:
Symbian OS releases 7.0s and 8.0a; and the corresponding Series 60
SDKs 2.0, 2.1, and 2.6.

Note that the patches only implement a “base port”, enabling one to
run Perl on Symbian, the basic operating system platform. The patches
do not implement any further Symbian OS or Series 60 (an application
framework) bindings to Perl. (A small Symbian / Series 60 interface
class and a small Series 60 application are included, though.)

It also seems that the patch allows Perl to be embedded into Series 60 C++ applications.

Since the primary way of using Perl on Symbian is a DLL (as described above),
I also wrote a small wrapper class for Series 60 (C++) applications that
want to embed a Perl interpreter, and a small Series 60 demonstration
application (PerlApp) using that wrapper class. As a bonus PerlApp knows
how to install Perl scripts (.pl, or hash-bang-perl) and Perl modules (.pm)
from the messaging application’s Inbox, and how to run scripts when invoked
via a filebrowser (either the one builtin to PerlApp, or an external one).

It’s fantastic to see that Nokia are working on getting Perl onto their smartphones. I’ve jealously looked on as Python developers have had their language implemented; now it seems that Perl could well be nearing an official launch.

Datasherpa And Automatic Page Tagging

A new product called Datasherpa has just been launched by Clickstream Technologies with the aim of ensuring all pages served by a webserver are automatically loaded with web analytics tags.

They claim their new product eliminates the burden of creating, inserting and testing page tags, and ensures all pages are tracked accurately.

It’s a really simple idea, and a bloody good one. We’ve been caught out before at work when a page hasn’t been correctly tagged and we’ve lost valuable traffic information.

It sounds like it would be really simple to build as a mod_perl handler for Apache. The handler would scan each page served, probably using the HTML::Parser module or even just a simple regular expression, detect the closing </body> tag, and just before that insert the tracking tag corresponding to the virtual host being served.
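As a rough illustration, here’s what the tag-insertion step itself might look like (the package name and tracking tag are placeholders, and the mod_perl filter plumbing around it is left out)…

package My::TagInserter;
use strict;

## placeholder tracking tag - a real handler would look this up
## based on the virtual host being served.
my $TAG = '<script src="/tracking.js" type="text/javascript"></script>';

## insert the tracking tag just before the closing </body> tag.
sub add_tracking_tag {
    my ($html) = @_;
    $html =~ s{(</body>)}{$TAG$1}i;
    return $html;
}

1;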

We use Webtrends at work, and this approach sounds like it should work really well with their system. It may even be worth mocking up a proof of concept quickly.

Grayscaling Images With Perl

One thing that caught my interest today was how to convert a colour image into grayscale.

It turns out the basic algorithm is very simple. Basically it’s just…

grey = 0.15 * red + 0.55 * green + 0.30 * blue;

This can be turned into a Perl subroutine using the following code.

sub grayscale {
    my ($r, $g, $b) = @_;
    my $s = 0.15 * $r + 0.55 * $g + 0.30 * $b;
    return int($s);
}

Here we pass in the RGB values of the colour we want to turn into gray. We apply the algorithm and return the integer value of gray.

The value we get for gray is used to replace each of the values for red, green and blue.

We can test this subroutine out with the help of the Perl GD module (available for free on CPAN).

#!/usr/bin/perl -w
use strict;
use GD;

## grayscale subroutine
sub grayscale {
    my ($r, $g, $b) = @_;
    my $s = 0.15 * $r + 0.55 * $g + 0.30 * $b;
    return int($s);
}

## make sure we read and write binary data
binmode STDIN;
binmode STDOUT;

## create a new GD object with the data passed via STDIN
my $image = GD::Image->new(\*STDIN);

## iterate over the number of colours in the colour table
for (my $i = 0; $i < $image->colorsTotal(); $i++) {
    ## get the RGB values for the colour at index $i
    my ($r, $g, $b) = $image->rgb($i);
    ## convert the RGB to grayscale
    my $gray = grayscale($r, $g, $b);
    ## remove the original colour from the colour table
    $image->colorDeallocate($i);
    ## add in the new gray (this reuses the index we just freed)
    $image->colorAllocate($gray, $gray, $gray);
}

## pass the image as a GIF to STDOUT
print $image->gif;

This code takes an image piped in from STDIN and outputs a grayscale GIF version of the image to STDOUT.

If the code was saved as convert.pl it would be called as ./convert.pl < test.gif > test_result.gif.

Here’s a conversion I did earlier of a GIF image of Kitt, Bev and Justin at the Emap Performance Awards 2004 using the above Perl code.

Kitt, Bev and Justin in colour

Kitt, Bev and Justin in grayscale

CellTrack’ing Between Colchester And London

I’ve been looking at the CellTrack program for Series 60 phones recently.

This is a native Series 60 Symbian application that can record details of the current mobile phone cell your phone is using. It also lets you annotate each cell if you want.

CellTrack is something I downloaded for my Nokia 7610 a while ago, and have just installed on the Nokia 6630.

Screenshot of CellTrack running on a Nokia 6630

On Monday, while the train was running slow, I had it running and started to annotate stations so I could tell where I was in the evening when it’s dark outside. CellTrack has a feature that allows you to log used cells to a flat tab-separated file. In my case, as I have the software installed on the 6630’s MMC card, the file can be found in the directory E:\Nokia\Others\CellTrack and copied off using the Nokia PC Suite.

Here’s the journey I took on Tuesday morning by train. I turned on CellTrack at Marks Tey station and had it running to just before the train pulled into Stratford station in East London.

Time Cell ID LAC Cell Name Description
07:26:08 12972 629 XXBC97 B Marks tey station
07:27:15 12973 629 XXBC97 C Approaching marks tey
07:27:35 8812 629 XXB881 B Approaching kelvedon
07:28:03 4340 629 XXB434 A no info
07:29:01 4339 629 XXB433 X Kelvedon station
07:29:25 4341 629 XXB434 A Approaching kelvedon
07:31:40 16772 629 XXBG77 B Between witham and kelvedon
07:32:10 16774 629 XXBG77 X Between kelvedon and witham
07:32:43 2084 629 XXB208 X Approaching witham
07:34:09 2086 629 XXB208 F Witham station
07:36:34 382 629 XXB038 B Approaching witham
07:37:15 2086 629 XXB208 F Witham station
07:37:55 7249 629 XXB724 X Hatfield Peveral station
07:38:33 7251 629 XXB725 A Approaching hatfield peveral
07:39:30 13877 629 XXBD87 G Approaching hatfield peveral
07:39:40 13878 629 XXBD87 X Between hatfield peveral and chelmsford
07:39:52 13879 629 XXBD87 X Between hatfield peveral and chelmsford
07:41:17 3910 629 XXB391 A Approaching chelmsford
07:41:37 3912 629 XXB391 B Approaching chelmsford
07:42:07 16055 629 XXBG05 E Chelmsford station
07:43:01 3877 629 XXB387 G Chelmsford station
07:43:52 16057 629 XXBG05 G Approaching chelmsford
07:44:10 3879 629 XXB387 X Approaching chelmsford
07:44:24 5282 629 XXB528 B Approaching chelmsford
07:44:46 16779 629 XXBG77 X Between chelmsford and ingatestone
07:44:58 16778 629 XXBG77 X Approaching chelmsford
07:45:08 16779 629 XXBG77 X Between chelmsford and ingatestone
07:45:31 16780 629 XXBG78 A no info
07:45:49 2073 629 XXB207 C Between chelmsford and ingatestone
07:46:01 367 629 XXB036 G Between chelmsford and ingatestone
07:46:11 12354 629 XXBC35 X Between ingatestone and chelmsford
07:46:25 12355 629 XXBC35 E Between ingatestone and chelmsford
07:47:03 2073 629 XXB207 C Between chelmsford and ingatestone
07:47:21 369 629 XXB036 X Approaching ingatestone
07:47:32 11240 105 XXBB24 A Approaching ingatestone
07:48:14 11242 105 XXBB24 B Ingatestone station
07:48:34 3755 105 XXB375 E Ingatestone station
07:49:14 3756 105 XXB375 F Between ingatestone and shenfield
07:49:30 11239 105 XXBB23 X Between shenfield and ingatestone
07:50:09 16872 105 XXBG87 B Approaching shenfield
07:50:35 16875 105 XXBG87 E Approaching shenfield
07:50:49 3661 105 XXB366 A Approaching shenfield
07:51:42 3662 105 XXB366 B Shenfield station
07:51:54 3663 105 XXB366 C Shenfield station
07:55:03 531957 0 XXB-76 X ?:no info
07:55:25 531957 65535 XXB-76 X ?:no info
07:55:59 0 0 XXB000 A ?:no info
07:56:50 7240 105 XXB724 A no info
07:57:26 3788 105 XXB378 X no info
07:57:52 3789 105 XXB378 X Approaching gidea park
07:58:09 2068 105 XXB206 X no info
07:58:19 16035 105 XXBG03 E Gidea park station
07:59:31 19568 105 XXBJ56 X no info
07:59:45 5057 105 XXB505 G no info
08:00:16 197140 3008 XXB-12 F *:Gidea park station
08:01:09 10925 105 XXBA92 E no info
08:01:26 5058 105 XXB505 X Approaching gidea park
08:01:59 6249 700 XXB624 X Approaching gidea park
08:02:18 1381 700 XXB138 A no info
08:02:30 197214 3009 XXB-69 A no info
08:03:19 4829 700 XXB482 X no info
08:03:23 8611 600 XXB861 A Seven kings station
08:03:49 7748 600 XXB774 X no info
08:04:49 11170 700 XXBB17 A Approaching ilford
08:05:17 9724 600 XXB972 X Manor park station
08:05:39 3325 600 XXB332 E Approaching manor park
08:06:02 9726 600 XXB972 F Manor park station
08:06:16 17536 600 XXBH53 F Approaching forest gate
08:06:44 17535 600 XXBH53 E Forest gate station
08:07:55 1335 600 XXB133 E no info
08:08:19 14197 600 XXBE19 G no info
08:08:38 10334 700 XXBA33 X Maryland station

So what do some of the columns mean? Well, Cell ID is the ID taken from the actual cell, and LAC is the location area code of the cell. I’m not sure what Cell Name actually is; the CellTrack site says it comes from the cell broadcast as I have a service number set. The description is the text I entered to give a rough location for the cell.

As I said before, the log file has the data in tab-separated format. The data is recorded in the following order…

  1. Date
  2. Time
  3. Cell ID
  4. LAC
  5. Country
  6. Net
  7. Signal
  8. Signal dBm
  9. Cell Name
  10. Description

This makes it very easy for us to write a data extractor using Perl. Here’s the code I used to generate the table above.

#!/usr/bin/perl -w
use strict;
## Perl script to parse the CellTrack trace.log file, and split selected
## contents into an HTML table.
## Robert Price - rob@robertprice.co.uk - March 2005

## start the table, and print out a table header.
print "<table>\n";
print " <tr><th>Time</th><th>Cell ID</th><th>LAC</th><th>Cell Name</th><th>Description</th></tr>\n";

## iterate over each line, placing the contents in $line.
while (my $line = <>) {
    ## clean up the data a bit.
    chomp($line);          # lose trailing linefeeds.
    $line =~ s/\r//g;      # lose any rogue carriage returns.
    $line =~ s/\t */\t/g;  # remove preceding spaces from data.

    ## split the data in $line into variables.
    my ($date, $time, $cellid, $lac, $country, $net, $strength, $dBm, $cellname, $description) = split(/\t/, $line);

    ## create a copy of $time, and format it so it has colons between hours, minutes and seconds.
    my $nicetime = $time;
    $nicetime =~ s/(\d{2})(\d{2})(\d{2})/$1:$2:$3/g;

    ## print out the data we're interested in.
    print qq{ <tr><td><a name="$time"></a>$nicetime</td><td>$cellid</td><td>$lac</td><td>$cellname</td><td>$description</td></tr>\n};
}

## close the table.
print "</table>\n";

You may have noticed I didn’t bother to print the country or network used. Well that’s because it’s always the same for me. The country is 234 (UK) and the network is 33 (Orange). This may be more interesting when travelling abroad and using roaming.

WSSE Authentication For Atom Using Perl

Atom uses WSSE authentication for posting to and editing weblogs.

Mark Pilgrim explains more about this in layman’s terms in an old XML.com article, Atom Authentication.

This information is passed in an HTTP header, for example…

HTTP_X_WSSE UsernameToken Username="robertprice", PasswordDigest="l7FbmWdq8gBwHgshgQ4NonjrXPA=", Nonce="4djRSlpeyWeGzcNgatneSA==", Created="2005-2-5T17:18:15Z"

We need 4 pieces of information to create this string.

  1. Username
  2. Password
  3. Nonce
  4. Timestamp

A nonce, in this case, is a cryptographically random string, not the word Clinton Baptiste gets in Phoenix Nights (thanks to Matt Facer for the link). Here it’s encoded in base64.

The timestamp is the current time in W3DTF format.
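For a fresh request you’d generate both of these yourself. A quick sketch (note that rand isn’t cryptographically strong, so a real implementation should use a better source of randomness)…

use MIME::Base64;
use POSIX qw(strftime);

## 16 random bytes, base64 encoded, to use as our nonce.
my $nonce = MIME::Base64::encode_base64(
    join('', map { chr(int(rand(256))) } 1 .. 16), '');

## the current time in W3DTF format.
my $timestamp = strftime("%Y-%m-%dT%H:%M:%SZ", gmtime());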

The four items are then combined to form a password digest that the remote Atom system uses to verify the authenticity of the request. As it already knows your username and password, it can recreate the digest from the nonce and timestamp passed in the WSSE header and compare the two. The digest uses the well-known SHA-1 algorithm to hash the password, nonce and timestamp together, and is encoded in base64 for transport across the web.

We can use Perl to create the password digest, as shown in this example code.

use MIME::Base64;
use Digest::SHA1;

my $username = "robertprice";
my $password = "secret password";
my $nonce = "4djRSlpeyWeGzcNgatneSA==";
my $timestamp = "2005-2-5T17:18:15Z";
my $digest = MIME::Base64::encode_base64(Digest::SHA1::sha1($nonce . $timestamp . $password), '');

The password digest is now stored in the variable $digest.

We can also create the HTTP header from this if needed.

print qq{HTTP_X_WSSE UsernameToken Username="$username", PasswordDigest="$digest", Nonce="$nonce", Created="$timestamp"\n};

Please note, to use this Perl code, you have to have the MIME::Base64 and Digest::SHA1 modules installed. Both are freely available on CPAN.

Update – 22nd November 2006

Some more recent versions of Atom expect the digest to be generated with a base64 decoded version of the nonce. Using the example above, some example code for this would be…


## generate alternative digest using a base64-decoded nonce
my $alternative_digest = MIME::Base64::encode_base64(
    Digest::SHA1::sha1(MIME::Base64::decode_base64($nonce) . $timestamp . $password), '');

When using WSSE for password validation, I now always check the incoming digest against both versions of my generated digest to ensure compatibility with different versions of Atom-enabled software. One of the best examples of this is Nokia Lifeblog: older versions expect the nonce to be left as-is, newer versions expect it to be base64 decoded first.
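In code, that dual check might look something like this (a sketch; the subroutine name is my own)…

## return true if the supplied digest matches either the raw-nonce
## or the decoded-nonce form of our generated digest.
sub wsse_digest_ok {
    my ($supplied, $nonce, $timestamp, $password) = @_;
    my $raw = MIME::Base64::encode_base64(
        Digest::SHA1::sha1($nonce . $timestamp . $password), '');
    my $decoded = MIME::Base64::encode_base64(
        Digest::SHA1::sha1(MIME::Base64::decode_base64($nonce) . $timestamp . $password), '');
    return ($supplied eq $raw) || ($supplied eq $decoded);
}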

Precompiling Templates With Template Toolkit

I’ve been playing about with configuration options in the Template Toolkit to try to improve the performance of a site I maintain.

I’ve been focusing on the caching and compiling options in particular.

By setting the COMPILE_DIR and COMPILE_EXT options, Template Toolkit automatically compiles all the templates it uses to the specified directory. Once they are compiled, Template Toolkit will try to use them instead of the original template wherever possible. This seems to be giving some real speed increases and also reducing the load on the server.

use Template;

my $template = Template->new({
    COMPILE_DIR => '/tmp/compiled_templates',
    COMPILE_EXT => '.ttc',
});

Here we are storing our compiled templates in the /tmp/compiled_templates directory. Template Toolkit replicates the directory structure of the original template under this automatically. We’re also saying we want all compiled templates to end in the file extension .ttc.
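For completeness, templates are then processed as normal; the compiled copy is picked up transparently (the template name and variables here are just placeholders)…

## process a template as usual - the compiled .ttc version is
## created on first use and reused on subsequent requests.
$template->process('page.html', { title => 'Hello' })
    or die $template->error();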

It definitely seems to be a quick win for improving the performance of Template Toolkit based sites.

Parsing RDF In Perl With RDF::Simple

In this article I’ll describe how to parse and extract data from an RDF file using Jo Walsh’s RDF::Simple::Parser module in Perl.

RDF::Simple::Parser does what it says on the tin: it provides a simple way to parse RDF. Unfortunately, that simplicity can make it hard to extract data. All it returns from a successful parse of the RDF file is what Jo calls a “bucket-o-triples”. This is just an array of arrays: the outer array is the list of all the triples, and each inner array is a single triple broken down so the Subject is in position 0, the Predicate in position 1 and the Object in position 2.
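In other words, each element of the returned list is an array reference holding one triple, something like this (the values are illustrative)…

## one entry from the bucket-o-triples:
## [ 'genid:me', 'http://xmlns.com/foaf/0.1/knows', 'genid:ARP40722' ]
my ($subject, $predicate, $object) = @{ $triples[0] };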

Let’s define these as constants in Perl as they’re not going to be changing.

use constant SUBJECT => 0;
use constant PREDICATE => 1;
use constant OBJECT => 2;

I’m going to use my usual example of parsing my FOAF file, extracting the addresses of my friends’ FOAF files from it. See the example in What Is An RDF Triple for a full breakdown of this.

We’ll define the two predicates we need to look for as constants.

use constant KNOWS_PREDICATE => 'http://xmlns.com/foaf/0.1/knows';
use constant SEEALSO_PREDICATE => 'http://www.w3.org/2000/01/rdf-schema#seeAlso';

We need to load in the FOAF file, so we’ll take advantage of File::Slurp’s read_file method to do this and put it in a variable called $file.

my $file = read_file('./foaf.rdf');

Before we can use RDF::Simple::Parser, we need to create an instance of it. I’ll set the base address to www.robertprice.co.uk in this case.

my $parser = RDF::Simple::Parser->new(base => 'http://www.robertprice.co.uk/');

Now we have the instance, we can pass in our FOAF file for parsing and get back our triples.

my @triples = $parser->parse_rdf($file);

Let’s take a quick look at my FOAF file to get an example triple.

I know Cal Henderson, and this is represented in my FOAF file as…

<foaf:knows>
  <foaf:Person>
    <foaf:nick>Cal</foaf:nick>
    <foaf:name>Cal Henderson</foaf:name>
    <foaf:mbox_sha1sum>2971b1c2fd1d4f0e8f99c167cd85d522a614b07b</foaf:mbox_sha1sum>
    <rdfs:seeAlso rdf:resource="http://www.iamcal.com/foaf.xml"/>
  </foaf:Person>
</foaf:knows>

Using the RDF validator we can get the list of triples represented in this piece of RDF.


Triple Subject Predicate Object
1 genid:ARP40722 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://xmlns.com/foaf/0.1/Person
2 genid:ARP40722 http://xmlns.com/foaf/0.1/nick "Cal"
3 genid:ARP40722 http://xmlns.com/foaf/0.1/name "Cal Henderson"
4 genid:ARP40722 http://xmlns.com/foaf/0.1/mbox_sha1sum "2971b1c2fd1d4f0e8f99c167cd85d522a614b07b"
5 genid:ARP40722 http://www.w3.org/2000/01/rdf-schema#seeAlso http://www.iamcal.com/foaf.xml
6 genid:me http://xmlns.com/foaf/0.1/knows genid:ARP40722

The parts we are interested in are triples 5 and 6. We can see that triple 6’s Predicate matches our KNOWS_PREDICATE constant, and triple 5’s Predicate matches our SEEALSO_PREDICATE constant. What links the two is that triple 6’s Object is the same as triple 5’s Subject.

We know that if we search for triples whose predicate matches our KNOWS_PREDICATE we’ll get the triples that are to do with people I know. We can use Perl’s grep function to get these triples, then iterate over them in a foreach loop.

foreach my $known (grep { $_->[PREDICATE] eq KNOWS_PREDICATE } @triples) {

We are only interested in the triples whose Subject matches the Object of the triple we just found. Again, we can use grep to get these out so we can iterate over them.

foreach my $triple (grep { $_->[SUBJECT] eq $known->[OBJECT] } @triples) {

Now we just need to make sure that the triple’s Predicate matches our SEEALSO_PREDICATE constant, and if it does, we can print out the value of its Object.

if ($triple->[PREDICATE] eq SEEALSO_PREDICATE) {
    print $triple->[OBJECT], "\n";
}

Let’s put this all together into a working example…

#!/usr/bin/perl -w
use strict;
use File::Slurp;
use RDF::Simple::Parser;

## constants defining position of triple components in
## RDF::Simple triple lists.
use constant SUBJECT => 0;
use constant PREDICATE => 1;
use constant OBJECT => 2;

## some known predicates.
use constant KNOWS_PREDICATE => 'http://xmlns.com/foaf/0.1/knows';
use constant SEEALSO_PREDICATE => 'http://www.w3.org/2000/01/rdf-schema#seeAlso';

## read in my foaf file and put it in $file.
my $file = read_file('./foaf.rdf');

## create a new parser, using my domain as a base.
my $parser = RDF::Simple::Parser->new(base => 'http://www.robertprice.co.uk/');

## parse my foaf file, and return a list of triples.
my @triples = $parser->parse_rdf($file);

## iterate over the triples matching the KNOWS_PREDICATE value.
foreach my $known (grep { $_->[PREDICATE] eq KNOWS_PREDICATE } @triples) {
    ## iterate over the triples that have the same subject
    ## as one of our KNOWS_PREDICATE triples' objects.
    foreach my $triple (grep { $_->[SUBJECT] eq $known->[OBJECT] } @triples) {
        ## find triples that match the SEEALSO_PREDICATE
        if ($triple->[PREDICATE] eq SEEALSO_PREDICATE) {
            ## print out the object, which should be the address
            ## of my friend's foaf file.
            print $triple->[OBJECT], "\n";
        }
    }
}

The example will load in the FOAF file, parse it and print out any friends of mine that have FOAF files defined by the seeAlso predicate.

Querying RDF In Perl With RDFStore

Apart from RDF::Core and Redland, another option for parsing and querying RDF in Perl is RDFStore. This also provides the Perl RDQL::Parser module used by the very useful DBD::RDFStore driver.

Following on from the previous examples showing how to extract information from my FOAF file using RDF::Core (Query RDF In Perl With RDF::Core) and RDF::Redland (Querying RDF In Perl With RDF::Redland), here I’ll re-implement the query using RDFStore.

As a quick recap from the previous articles, here is the bit of RDF we want to extract information from.

<foaf:knows>
  <foaf:Person>
    <foaf:nick>Cal</foaf:nick>
    <foaf:name>Cal Henderson</foaf:name>
    <foaf:mbox_sha1sum>2971b1c2fd1d4f0e8f99c167cd85d522a614b07b</foaf:mbox_sha1sum>
    <rdfs:seeAlso rdf:resource="http://www.iamcal.com/foaf.xml"/>
  </foaf:Person>
</foaf:knows>

The solution used to extract the data from the RDF looks a lot more Perl-like than the previous examples we have seen.

If you have ever queried databases using SQL in Perl, then you have certainly come across the powerful DBI module. This abstracts common database usage, making it easy to port your applications between databases. One of the best things about RDFStore is that it provides a DBD driver, allowing you to use standard DBI methods when querying your RDF data. Unlike other modules that make you create triple stores and factory methods yourself, RDFStore keeps all that hidden from you.

To start with we’ll need to create a database handle using DBI and the DBD::RDFStore modules and store it in the variable $dbh.

my $dbh = DBI->connect("DBI:RDFStore:");

This creates a database on the fly, but we can connect to an existing database on a local or remote server if we so wished.

Now we need to create our RDQL query. It looks very similar to the query we used in the Redland example.

my $query = $dbh->prepare(<<QUERY);
SELECT ?name ?nick ?seeAlso ?mbox_sha1sum
FROM <file:foaf.rdf>
WHERE
(?x <rdf:type> <foaf:Person>),
(?x <foaf:name> ?name),
(?x <foaf:nick> ?nick),
(?x <rdfs:seeAlso> ?seeAlso),
(?x <foaf:mbox_sha1sum> ?mbox_sha1sum)
AND
(?nick eq 'Cal')
USING
rdf for <http://www.w3.org/1999/02/22-rdf-syntax-ns#>,
rdfs for <http://www.w3.org/2000/01/rdf-schema#>,
foaf for <http://xmlns.com/foaf/0.1/>
QUERY

Here we’re selecting the values of the name, nick, seeAlso and mbox_sha1sum properties for a Person with the nick of Cal. We’ve explicitly set where our triples come from using the FROM clause. In this case, it’s the file foaf.rdf, which contains my FOAF information.

We have the query in the variable $query, so let’s execute it.

$query->execute();

We can use standard DBI methods to fetch the data from our query. Here I’m going to create some bound variables to keep any matching data in.

my ($name, $nick, $seeAlso, $mbox_sha1sum);
$query->bind_columns(\$name, \$nick, \$seeAlso, \$mbox_sha1sum);

Now we just have to fetch each row that matches our query and print them out.

while ($query->fetch()) {
    print $name->toString, "\n";
    print $nick->toString, "\n";
    print $seeAlso->toString, "\n";
    print $mbox_sha1sum->toString, "\n";
}

The values returned are either RDFStore::Literal or RDFStore::Resource objects, so we have to use their toString methods to print them.

To tidy up, we’ll finish our query and disconnect from our database.

$query->finish;
$dbh->disconnect;

That’s it! It really is as simple as that.

Let’s put this all together now to produce our final example code listing.

#!/usr/bin/perl -w
## An example showing how to use RDFStore and RDQL::Parser to
## extract information from a FOAF file.
## Copyright 2004 - Robert Price - http://www.robertprice.co.uk/
use strict;
use DBI;

## create a DBI connection to an in-memory RDFStore model.
my $dbh = DBI->connect("DBI:RDFStore:");

## prepare our query.
my $query = $dbh->prepare(<<QUERY);
SELECT ?name ?nick ?seeAlso ?mbox_sha1sum
FROM <file:foaf.rdf>
WHERE
(?x <rdf:type> <foaf:Person>),
(?x <foaf:name> ?name),
(?x <foaf:nick> ?nick),
(?x <rdfs:seeAlso> ?seeAlso),
(?x <foaf:mbox_sha1sum> ?mbox_sha1sum)
AND
(?nick eq 'Cal')
USING
rdf for <http://www.w3.org/1999/02/22-rdf-syntax-ns#>,
rdfs for <http://www.w3.org/2000/01/rdf-schema#>,
foaf for <http://xmlns.com/foaf/0.1/>
QUERY

## execute the query.
$query->execute();

## define some holding variables and bind them to our query results.
my ($name, $nick, $seeAlso, $mbox_sha1sum);
$query->bind_columns(\$name, \$nick, \$seeAlso, \$mbox_sha1sum);

## while we have results being returned...
while ($query->fetch()) {
    ## print out the values.
    ## As these can be RDFStore::Literal or RDFStore::Resource objects we
    ## need to use their toString methods to print.
    print $name->toString, "\n";
    print $nick->toString, "\n";
    print $seeAlso->toString, "\n";
    print $mbox_sha1sum->toString, "\n";
}

## end the query and disconnect.
$query->finish;
$dbh->disconnect;

In conclusion, RDFStore provides a very clean and Perlish interface for querying RDF data. Its DBD driver allows standard DBI methods to be used, making it quick and simple for Perl developers to learn and use effectively.