Train Departure And Arrival RSS Feeds
I've been taking a greater interest in RSS feeds lately, and have installed a few feed readers.
One thing that doesn't seem to exist is a way to track train departure times in my feed reader. I thought this would be great, as I could see at a glance whether my train was going to be late. With this in mind, I visited NationalRail.co.uk's live departure board website and wrote a quick screen scraper to provide such a feed.
The script can be found at http://www.robertprice.co.uk/cgi-bin/train_departure_rss.pl, and takes a single parameter, stationcode, containing the station's code. The codes for national stations can be found on NationalRail.co.uk's live departure board website. For example, Hampden Park's code is HMD, as found at http://www.livedepartureboards.co.uk/ldb/summary.aspx?T=HMD. So, to check this station's departures via the RSS feed, we use the URL http://www.robertprice.co.uk/cgi-bin/train_departure_rss.pl?stationcode=HMD.
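As a rough sketch of the URL scheme described above, here is how a feed URL could be built from a station code. The helper name feed_url_for is hypothetical, and the three-letter check is an assumption based on the codes shown in this post (HMD, COL) rather than anything the script itself enforces.

```perl
use strict;
use warnings;

## Hypothetical helper: build the feed URL for a station code.
## Assumes codes are three letters, like HMD and COL in this post.
sub feed_url_for {
    my ($code) = @_;
    die "Station code must be three letters\n"
        unless defined $code && $code =~ /^[A-Za-z]{3}$/;
    return 'http://www.robertprice.co.uk/cgi-bin/train_departure_rss.pl?stationcode='
        . uc($code);
}

print feed_url_for('HMD'), "\n";
```

The resulting URL is what you would paste into your feed reader as a new subscription.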
Here's an example of the feed viewed in NetNewsWire Lite on OS X.

Use of this script may breach NationalRail.co.uk's acceptable use policy, which prohibits scripts from drawing large amounts of data from the site, so use it at your own risk, and please don't spoil this excellent service for us all. I believe it is OK to run this script occasionally for personal use, so to minimise impact the RSS feed is set to refresh every 10 minutes, whereas the website refreshes every 2 minutes.
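The 10 minute figure comes from the syndication settings in the script, which use an updatePeriod of hourly and an updateFrequency of 6, meaning six updates per hour. A small sketch of that arithmetic follows; the refresh_minutes helper and the lookup table are purely illustrative and are not part of the script.

```perl
use strict;
use warnings;

## Length of each syndication updatePeriod in minutes (illustrative subset).
my %period_minutes = (
    hourly => 60,
    daily  => 60 * 24,
    weekly => 60 * 24 * 7,
);

## Hypothetical helper: updateFrequency is the number of updates per period,
## so the gap between refreshes is the period length divided by the frequency.
sub refresh_minutes {
    my ($period, $frequency) = @_;
    return $period_minutes{$period} / $frequency;
}

print refresh_minutes('hourly', 6), "\n";   # prints 10
```

Whether a given feed reader honours these hints is up to the reader, so treat the interval as a polite request rather than a guarantee.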
Users of mobile devices may like to check out NationalRail.co.uk's WAP service as they have a version of the departure boards for mobile phones.
The Perl script is included below so you can install it on your own server if needed, and also save my personal bandwidth. :-)
Please note that the code below has been superseded by the code on the page New Train Departure And Arrival RSS Feeds.
#!/usr/bin/perl -w
## Script to convert train departures from livedepartureboards.co.uk
## into RSS feeds.
## Takes the parameter "stationcode", as defined on the nationalrail.co.uk
## page - http://www.nationalrail.co.uk/frameset.asp?location=ldb
##
## Robert Price - www.robertprice.co.uk - 10 July 2004
use strict;
use CGI;
use HTTP::Request;
use LWP::UserAgent;
use XML::RSS;
## Take the stationcode parameter, else default to COL (Colchester)
my $CGI = new CGI;
my $stationcode = $CGI->param('stationcode') || 'COL';
## The URL we have to screen scrape for the information.
my $url = 'http://www.livedepartureboards.co.uk/ldb/summary.aspx?T=' . $stationcode;
## Create our new browser, and pretend to be IE.
my $ua = LWP::UserAgent->new;
$ua->agent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)");
## Get the data from the livedepartureboards.co.uk website.
my $req = HTTP::Request->new(GET => $url);
$req->header('Accept' => 'text/html');
my $res = $ua->request($req);
if ($res->is_success) {
    my $page = $res->content;
    ## Get the station name and when the board was last updated.
    my ($updated) = ($page =~ m[Last updated:\r\n(.*?)\r\n]sg);
    my ($station_name) = ($page =~ m[<H1>Train Times for\r\n(.*?)\r\n]sg);
    ## Fall back to the station code if the name could not be found.
    $station_name = $stationcode unless defined $station_name;
    ## Create our RSS data, update every 10 minutes (6 times an hour).
    my $rss = new XML::RSS (version => '1.0');
    $rss->channel(
        title => $station_name . ' Train Departures',
        link => 'http://www.nationalrail.co.uk/',
        description => 'Train times from ' . $station_name,
        syn => {
            updatePeriod => 'hourly',
            updateFrequency => '6'
        },
    );
    ## Lose nbsp's (they're wrong on the site so no ;)
    $page =~ s/&nbsp//sg;
    ## Monster regex to extract the data, one table row per train.
    while ($page =~ m[<tr.*?>\r\n<td><a href="(.*?)">(.*?)</a></td>\r\n<td.*?>(.*?)</td>\r\n<td><span.*?>(.*?)</span></td>\r\n<td><a href=".*?">(.*?)</a></td>\r\n<td.*?>(.*?)</td>\r\n<td><span.*?>(.*?)</span></td>\r\n<td><a href=".*?">(.*?)</a></td>\r\n</tr>]sg) {
        ## Build a hash to store the information in.
        my $train = {
            'link' => 'http://www.livedepartureboards.co.uk/ldb/' . $1,
            'from' => $2,
            'timetabled_arrival' => $3,
            'expected_arrival' => $4,
            'to' => $5,
            'timetabled_departure' => $6,
            'expected_departure' => $7,
            'operator' => $8
        };
        ## Add the train departure to the RSS feed.
        $rss->add_item(
            title => ($train->{'timetabled_departure'} ? $train->{'timetabled_departure'} . ' to ' : '') . $train->{'to'} . ($train->{'expected_departure'} ? ( ' (' . $train->{'expected_departure'} . ')' ) : ''),
            link => $train->{'link'},
            description =>
                "From: " . $train->{'from'} . ",\n" .
                "Timetabled Arrival: " . $train->{'timetabled_arrival'} . ",\n" .
                "Expected Arrival: " . $train->{'expected_arrival'} . ",\n" .
                "To: " . $train->{'to'} . ",\n" .
                "Timetabled Departure: " . $train->{'timetabled_departure'} . ",\n" .
                "Expected Departure: " . $train->{'expected_departure'} . ",\n" .
                "Operator: " . $train->{'operator'} . "\n",
        );
    }
    ## Return the RSS feed, ensuring we have the correct MIME type.
    print "Content-type: application/rdf+xml\n\n";
    print $rss->as_string;
} else {
    ## Report the failed fetch instead of returning an empty response.
    print "Status: 502 Bad Gateway\n";
    print "Content-type: text/plain\n\n";
    print "Unable to fetch departures: " . $res->status_line . "\n";
}
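To see how the row-matching regex in the script picks a train apart, here is a small standalone sketch that runs the same pattern over a single made-up table row. The station names, times, operator and link targets are all invented for illustration, and the row is only my guess at the shape of markup the pattern expects, not a copy of the real page.

```perl
use strict;
use warnings;

## Invented sample row, joined with CRLF line endings as the pattern requires.
my $page = join("\r\n",
    '<tr class="row">',
    '<td><a href="example-train-link">Exampleton</a></td>',
    '<td class="time">10:12</td>',
    '<td><span class="status">On time</span></td>',
    '<td><a href="example-station-link">Sampleford</a></td>',
    '<td class="time">10:13</td>',
    '<td><span class="status">10:18</span></td>',
    '<td><a href="example-operator-link">Example Trains</a></td>',
    '</tr>',
) . "\r\n";

## The same pattern as the script, compiled once for readability.
my $row_pattern = qr[<tr.*?>\r\n<td><a href="(.*?)">(.*?)</a></td>\r\n<td.*?>(.*?)</td>\r\n<td><span.*?>(.*?)</span></td>\r\n<td><a href=".*?">(.*?)</a></td>\r\n<td.*?>(.*?)</td>\r\n<td><span.*?>(.*?)</span></td>\r\n<td><a href=".*?">(.*?)</a></td>\r\n</tr>]s;

## Collect one hash per matched row, using the script's field names.
my @trains;
while ($page =~ /$row_pattern/g) {
    push @trains, {
        link                 => $1,
        from                 => $2,
        timetabled_arrival   => $3,
        expected_arrival     => $4,
        to                   => $5,
        timetabled_departure => $6,
        expected_departure   => $7,
        operator             => $8,
    };
}

## Build the item title the same way the script does.
for my $train (@trains) {
    print $train->{timetabled_departure}, ' to ', $train->{to},
        ' (', $train->{expected_departure}, ")\n";
}
```

Because the pattern depends on exact tag order and line endings, any change to the site's markup will stop rows matching, which is the usual weakness of screen scraping.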





