Querying RDF In Perl with RDF::Redland
Previously I've shown how to parse and query RDF files using the Perl RDF::Core modules.
If you are serious about using RDF with Perl then you should take a look at Dave Beckett's excellent Redland framework.
Redland is a free, open source C library for parsing, storing and querying RDF files. The Perl bindings can be found on CPAN as RDF::Redland.
Redland has several advantages over some of the other Perl RDF libraries. It's small, cross platform, very fast, standards compliant and written by one of the authors of the RDF specification for the W3C. The downside is that it can be a bit tricky to install and some of the Perl documentation is lacking in depth.
Let's have a look at parsing and querying my FOAF file to extract the same information as in my previous article How To Query RDF In Perl Using RDF::Core.
Firstly we need to set up Redland. It works in a similar way to RDF::Core in that you have to define some storage and create a model to use. In this example we'll use memory, but Redland also lets you bind to various databases. See the documentation for RDF::Redland::Storage and for Redland itself for more details.
my $storage = new RDF::Redland::Storage("memory");
die "unable to create storage" unless $storage;
my $model = new RDF::Redland::Model($storage, "");
die "unable to create model" unless $model;
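If you want the triples to survive between runs, the in-memory store can be swapped for one of Redland's on-disk stores. The following is a minimal sketch, assuming Redland was built with Berkeley DB support. The helper names storage_options and open_bdb_storage are made up for illustration and are not part of RDF::Redland. The three-argument form of the constructor takes the storage type, a name for the store and an options string.

```perl
use strict;
use warnings;

## Build a Redland options string such as "dir='.',hash-type='bdb'"
## from a hash. (storage_options is a hypothetical helper.)
sub storage_options {
    my %opts = @_;
    return join ",", map { "$_='$opts{$_}'" } sort keys %opts;
}

## Open a Berkeley DB backed "hashes" store called $name in the
## current directory. Pass a true $create to start a fresh store
## rather than reopen an existing one. (Hypothetical helper.)
sub open_bdb_storage {
    my ($name, $create) = @_;
    require RDF::Redland;    ## loaded here so the helper above works without it
    my %opts = ('hash-type' => 'bdb', dir => '.');
    $opts{new} = 'yes' if $create;
    my $storage = RDF::Redland::Storage->new("hashes", $name, storage_options(%opts));
    die "unable to create storage" unless $storage;
    return $storage;
}
```

With this in place, the storage line could become my $storage = open_bdb_storage("foaf", 1); on the first run, where "foaf" is just an example store name. The rest of the script would stay the same.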
Next we need to define the file we want to parse. In this case it's a local file called foaf.rdf. Redland uses an RDF::Redland::URI object to define each data source.
my $uri = new RDF::Redland::URI("file:foaf.rdf");
This RDF file is in XML format, so we'll need to create a parser capable of reading it.
my $parser = new RDF::Redland::Parser("rdfxml", "application/rdf+xml");
die "unable to create parser" unless $parser;
We have all we need to parse the file, so let's do it. We'll need to add each triple the parser finds to the $model so we can query it later.
my $stream = $parser->parse_as_stream($uri, $uri);
while (!$stream->end) {
    $model->add_statement($stream->current);
    $stream->next;
}
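An empty model usually means the file was not found or could not be parsed, which is easy to miss because the loop above then simply does nothing. As a sketch, the same loop can be wrapped in a small subroutine that counts what it adds. The name load_stream is hypothetical, and the code relies only on the end, current, next and add_statement methods already used above.

```perl
use strict;
use warnings;

## Copy every statement from a parser stream into a model and
## return how many were added. (load_stream is a hypothetical helper.)
sub load_stream {
    my ($stream, $model) = @_;
    die "parser returned no stream" unless $stream;
    my $count = 0;
    while (!$stream->end) {
        $model->add_statement($stream->current);
        $count++;
        $stream->next;
    }
    return $count;
}
```

It could be called as my $count = load_stream($parser->parse_as_stream($uri, $uri), $model); followed by a die if $count is zero.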
At this stage we have all the triples in our model ready to query. We'll use Redland's query functionality (provided by the rasqal library) to extract information about Cal from my FOAF file. The triples we want are in this bit of RDF.
<foaf:knows>
<foaf:Person>
<foaf:nick>Cal</foaf:nick>
<foaf:name>Cal Henderson</foaf:name>
<foaf:mbox_sha1sum>2971b1c2fd1d4f0e8f99c167cd85d522a614b07b</foaf:mbox_sha1sum>
<rdfs:seeAlso rdf:resource="http://www.iamcal.com/foaf.xml"/>
</foaf:Person>
</foaf:knows>
For a detailed explanation of the above, see my article What Is An RDF Triple.
Redland uses RDQL to query the data. This is a very powerful query language that looks a bit like SQL and RDF::Core's query language. We have to provide a match for each triple we are looking for.
Let's put an RDQL query that fetches Cal's name, nickname, FOAF file URL and mbox checksum into a variable called $string.
my $string = <<EOF_QUERY;
SELECT ?name ?nick ?seeAlso ?mbox_sha1sum
WHERE
(?x rdf:type foaf:Person),
(?x foaf:name ?name),
(?x rdfs:seeAlso ?seeAlso),
(?x foaf:mbox_sha1sum ?mbox_sha1sum),
(?x foaf:nick ?nick)
AND
(?nick eq 'Cal')
USING
foaf FOR <http://xmlns.com/foaf/0.1/>
EOF_QUERY
In the SELECT clause we say that we want to extract the data represented by the placeholders ?name, ?nick, ?seeAlso and ?mbox_sha1sum.
We define what these variables mean in the WHERE block. Each pattern is a triple in the usual subject, predicate, object order. They all need to have the same subject, so we'll define a temporary variable called ?x to query against.
We make the query concrete by defining that ?x must have the rdf:type of foaf:Person. We also use the AND block to say that variable ?nick must equal 'Cal'.
We also define the foaf namespace in the USING block.
Putting this all together means we want a selection of triples for the foaf:Person with the nickname of Cal.
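To see what the WHERE and AND blocks are asking for, here is a sketch of the same matching done by hand in plain Perl over a list of triples. It is not how Redland or rasqal work internally, and it assumes one value per predicate. The labels _:cal and _:me and the prefixed names used as plain strings are shorthand made up for this example.

```perl
use strict;
use warnings;

## Some triples as [subject, predicate, object] arrays.
my @triples = (
    ['_:cal', 'rdf:type',          'foaf:Person'],
    ['_:cal', 'foaf:nick',         'Cal'],
    ['_:cal', 'foaf:name',         'Cal Henderson'],
    ['_:cal', 'foaf:mbox_sha1sum', '2971b1c2fd1d4f0e8f99c167cd85d522a614b07b'],
    ['_:cal', 'rdfs:seeAlso',      'http://www.iamcal.com/foaf.xml'],
    ['_:me',  'rdf:type',          'foaf:Person'],
    ['_:me',  'foaf:name',         'Robert Price'],
);

## Group the triples by subject, so each subject can play the role of ?x.
my %by_subject;
for my $triple (@triples) {
    my ($s, $p, $o) = @$triple;
    $by_subject{$s}{$p} = $o;
}

## Keep only subjects that match every pattern in the WHERE block
## and that satisfy the AND constraint on ?nick.
my @wanted = qw(rdf:type foaf:name rdfs:seeAlso foaf:mbox_sha1sum foaf:nick);
my @rows;
for my $x (sort keys %by_subject) {
    my $props = $by_subject{$x};
    next if grep { !exists $props->{$_} } @wanted;
    next unless $props->{'rdf:type'} eq 'foaf:Person';
    next unless $props->{'foaf:nick'} eq 'Cal';
    push @rows, {
        name         => $props->{'foaf:name'},
        nick         => $props->{'foaf:nick'},
        seeAlso      => $props->{'rdfs:seeAlso'},
        mbox_sha1sum => $props->{'foaf:mbox_sha1sum'},
    };
}

print "$_->{name} ($_->{nick})\n" for @rows;
```

Only _:cal survives, because _:me is missing several of the predicates the WHERE block requires.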
We have our query string, so let's actually query our data model.
my $query = new RDF::Redland::Query($string);
my $results = $model->query_execute($query);
$results now contains an RDF::Redland::QueryResults object holding the results of our RDQL query. Each result is a set of variable bindings whose values are RDF::Redland::Node objects. We need to iterate over the results and display the values in each one. In our case, as there is only one entry for Cal, we should only get one result back.
while (!$results->finished) {
    ## print the triple values out.
    print node_value($results->binding_value_by_name('name')), "\n";
    print node_value($results->binding_value_by_name('nick')), "\n";
    print node_value($results->binding_value_by_name('seeAlso')), "\n";
    print node_value($results->binding_value_by_name('mbox_sha1sum')), "\n";
    ## get the next set of results.
    $results->next_result;
}
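If you would rather not hard-code the variable names, the results can also be walked by position. This sketch assumes the bindings_count, binding_name and binding_value methods behave as described in the RDF::Redland::QueryResults documentation. It collects each result into a hash of variable name to node. The name collect_rows is hypothetical.

```perl
use strict;
use warnings;

## Turn a query results object into a list of hash references, one per
## result, mapping each variable name to its bound node.
## (collect_rows is a hypothetical helper.)
sub collect_rows {
    my ($results) = @_;
    return unless $results;    ## the query may have failed
    my @rows;
    while (!$results->finished) {
        my %row;
        for my $i (0 .. $results->bindings_count - 1) {
            $row{ $results->binding_name($i) } = $results->binding_value($i);
        }
        push @rows, \%row;
        $results->next_result;
    }
    return @rows;
}
```

Each row can then be handled like any other Perl hash, for example by passing $row->{name} to a subroutine that turns the node into a string.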
We're using the binding_value_by_name method from RDF::Redland::QueryResults to specify which bound values we want back. As an RDF::Redland::Node can represent a literal value as well as a URI or a blank node, we need to make sure we only get the relevant string back to print. To handle this neatly, we write a subroutine called node_value to return a suitable string value.
sub node_value {
    my $node = shift;
    ## if the node is a resource, then return the URI's string value.
    if ($node->is_resource()) {
        return $node->uri->as_string;
    ## if the node is a literal, return the string value.
    } elsif ($node->is_literal()) {
        return $node->as_string;
    ## else return an empty string
    } else {
        return "";
    }
}
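The subroutine above returns an empty string for blank nodes, which makes them indistinguishable from an empty literal. If that matters, one possible variation is sketched below. It assumes the is_blank, blank_identifier and literal_value methods documented for RDF::Redland::Node. The name node_label and the _: prefix are choices made for this example.

```perl
use strict;
use warnings;

## Return a printable label for a node: the URI for resources, the
## literal text for literals and "_:" plus the identifier for blank nodes.
## (node_label is a hypothetical variation on node_value.)
sub node_label {
    my $node = shift;
    return "" unless defined $node;    ## variable was not bound
    return $node->uri->as_string             if $node->is_resource;
    return $node->literal_value              if $node->is_literal;
    return "_:" . $node->blank_identifier    if $node->is_blank;
    return "";
}
```

It is called in the same way as node_value, so either subroutine can be used in the printing loop.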
Running the code should print to the screen...
Cal Henderson
Cal
http://www.iamcal.com/foaf.xml
2971b1c2fd1d4f0e8f99c167cd85d522a614b07b
Let's put the code together to produce the final script.
#!/usr/bin/perl -w
## An example showing how to use Redland's Perl bindings to
## extract information from a FOAF file.
## Copyright 2004 - Robert Price - http://www.robertprice.co.uk/
use strict;
use RDF::Redland;
## Create storage, in this case we'll be using memory.
my $storage = new RDF::Redland::Storage("memory");
die "unable to create storage" unless $storage;
## Create a model using our storage.
my $model = new RDF::Redland::Model($storage, "");
die "unable to create model" unless $model;
## We want to access a local copy of our FOAF file, so define that here.
my $uri = new RDF::Redland::URI("file:foaf.rdf");
## create an RDF XML parser as that's the format of our FOAF file.
my $parser = new RDF::Redland::Parser("rdfxml", "application/rdf+xml");
die "unable to create parser" unless $parser;
## parse the file and add each triple found to our model.
my $stream = $parser->parse_as_stream($uri, $uri);
while (!$stream->end) {
    $model->add_statement($stream->current);
    $stream->next;
}
## build the query and store it in $string.
my $string = <<EOF_QUERY;
SELECT ?name ?nick ?seeAlso ?mbox_sha1sum
WHERE
(?x rdf:type foaf:Person),
(?x foaf:name ?name),
(?x rdfs:seeAlso ?seeAlso),
(?x foaf:mbox_sha1sum ?mbox_sha1sum),
(?x foaf:nick ?nick)
AND
(?nick eq 'Cal')
USING
foaf FOR <http://xmlns.com/foaf/0.1/>
EOF_QUERY
## create and run the query, returning a results iterator.
my $query = new RDF::Redland::Query($string);
my $results = $model->query_execute($query);
## while we have results...
while (!$results->finished) {
    print node_value($results->binding_value_by_name('name')), "\n";
    print node_value($results->binding_value_by_name('nick')), "\n";
    print node_value($results->binding_value_by_name('seeAlso')), "\n";
    print node_value($results->binding_value_by_name('mbox_sha1sum')), "\n";
    ## get the next set of results.
    $results->next_result;
}
## Utility subroutine to return a node's string value.
sub node_value {
    my $node = shift;
    ## if the node is a resource, then return the URI's string value.
    if ($node->is_resource()) {
        return $node->uri->as_string;
    ## if the node is a literal, return the string value.
    } elsif ($node->is_literal()) {
        return $node->as_string;
    ## else (for example a blank node) return an empty string.
    } else {
        return "";
    }
}
Hopefully this article has shown you how easy it is to query RDF data in Perl and return meaningful results using Redland.
