
Heathrow Arrivals

IMPORTANT

15th April 2006 - The Perl script below has now been updated to work with the current BAA website. Its new location is http://www.robertprice.co.uk/robblog/archive/2006/4/Updated_Heathrow_Arrivals_WAP_Service.shtml

Introduction

Needing to know when to meet a flight at London's Heathrow airport, I had a look on the web to see what I could find. The BAA have live Heathrow arrivals on their website. They also offer a mobile solution called BAA Mobile Services, but it works only on certain networks and costs 30p excluding VAT.

Not wanting to pay for a service that is free for web surfers, I decided to try to write a screen scraper that would let me get live details via a WAP phone. There are no terms and conditions on the BAA website forbidding this, so I believe this is legal.

The script provides you with search functionality, where you enter the flight number you are interested in, and it returns any information it can find relating to it. A search field lets you search for other flights if you wish to do so.

Perl Script

The Perl script fetches the arrivals page, then uses a regular expression to iterate over it, extracting each flight's details and storing them in a hash with the flight number as the key. We then look up the flight number we want in the hash and print the relevant information.
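To see the extraction step on its own, here is a minimal sketch that runs the same regular expression over a hand-written sample instead of the live page. The sample markup, the flight XX123, and the helper name parse_arrivals are all assumptions made for illustration; the markup is inferred from the regular expression, not copied from the BAA site.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## Hand-written sample rows (hypothetical data). The layout is an
## assumption inferred from the regular expression, not real BAA markup.
my $page = join '',
    "<tr\n id='l1' ><td><b>14:05</b></td><td>BA911</td><td>Frankfurt</td>",
    "\n<td>LANDED 14:01</td>\n<td>1</td></tr>",
    "<tr\n id='l2' ><td><b>14:10</b></td><td>XX123</td><td>Paris</td>",
    "\n<td>EXPECTED 14:20</td>\n<td>2</td></tr>";

## Walk the page row by row and build a hash keyed on flight number.
sub parse_arrivals {
    my ($html) = @_;
    my %arrivals;
    while ($html =~ m[
        <tr\s\sid='l.*?'\s>      # opening tr
        <td><b>(.*?)</b></td>    # scheduled time
        <td>(.*?)</td>           # flight number
        <td>(.*?)</td>           # from
        \s<td>(.*?)</td>         # status
        \s<td>(.*?)</td>         # terminal
        </tr>                    # closing tr
    ]xsg) {
        ## skip rows without a flight number
        next unless $2;
        $arrivals{$2} = {
            scheduled_time => $1,
            flight_number  => $2,
            from           => $3,
            status         => $4,
            terminal       => $5,
        };
    }
    return \%arrivals;
}

my $arrivals = parse_arrivals($page);
my $details  = $arrivals->{'BA911'};
printf "%s from %s: %s\n",
    $details->{flight_number}, $details->{from}, $details->{status};
```

Because the hash is keyed on flight number, a lookup is a single `exists` test, at the cost of keeping only the last row seen if a flight number appears twice.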

The script lets you define a DEFAULT_FLIGHT constant; in this case I am interested in flight BA911.

#!/usr/bin/perl -w

## Script to screen scrape Heathrow arrivals from
## the BAA website, and show them on a WAP page.
## Robert Price - 26/01/2004

use strict;
use CGI;
use LWP::Simple;

## default flight number
use constant DEFAULT_FLIGHT => 'BA911';

## url of the heathrow arrivals, seems to be the
## same format for other airports.
my $url = 'http://www.baa.com/arrivals/LHR_2.html';

## get the flight we're interested in.
my $CGI = CGI->new;
my $flight = $CGI->param('flight') || DEFAULT_FLIGHT;

## uppercase the flight to make it easier to search.
$flight = uc($flight);

## send the header and start of the WML page.
## the blank line ends the HTTP headers.
print<<HEADER;
Content-type: text/vnd.wap.wml

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
<card id="main" title="Flights">
<p>Arrivals at Heathrow</p>
HEADER

## if we're looking for a flight...
if ($flight) {

    ## get the arrivals page. get returns undef if the
    ## fetch fails, so fall back to an empty page.
    my $page = get $url;
    $page = '' unless defined $page;

    ## holding hash for flights.
    my %arrivals = ();

    ## iterate over the page extracting information.
    while ($page =~ m[
        <tr\s\sid='l.*?'\s>      # opening tr
        <td><b>(.*?)</b></td>    # scheduled time
        <td>(.*?)</td>           # flight number
        <td>(.*?)</td>           # from
        \s<td>(.*?)</td>         # status
        \s<td>(.*?)</td>         # terminal
        </tr>                    # closing tr
    ]xsg) {

        ## skip if we have no flight number
        next unless ($2);

        ## store the information in the hash.
        $arrivals{$2} = {
            'scheduled_time' => $1,
            'flight_number'  => $2,
            'from'           => $3,
            'status'         => $4,
            'terminal'       => $5,
        };
    }

    ## get the page modification time.
    my ($modified) = ($page =~ m[var Modified='(.*?)';]g);

    ## if we have found the flight, print
    ## out its information to the WML page.
    if (exists $arrivals{$flight}) {
        my $details = $arrivals{$flight};
        print "<p>\n";
        print 'Flight: ' . $details->{'flight_number'} . "<br/>\n";
        print 'From: ' . $details->{'from'} . "<br/>\n";
        print 'Scheduled:' . $details->{'scheduled_time'} . "<br/>\n";
        print 'Status: ' . $details->{'status'} . "<br/>\n";
        print 'Terminal: ' . $details->{'terminal'} . "<br/>\n";
        print "</p>\n";
    } else {
        ## the flight number came from the user, so escape it
        ## before echoing it back into the page.
        my $safe_flight = $CGI->escapeHTML($flight);
        print "<p>Flight $safe_flight doesn't seem to be listed</p>\n";
    }

    ## only show the update time if we found one.
    print "<p>Updated: $modified</p>\n" if defined $modified;
}

## print the end of the page and the query
## form if we want to search again.
## \$flight is escaped so Perl leaves it alone: it is the WML
## variable set by the input field, not the Perl variable.
print<<FOOTER;
<p>
Flight:<br/>
<input name="flight" emptyok="false"/>
<br/>
</p>
<p>
<anchor title="search">
<go href="/cgi-bin/mobile/arrivals.pl" method="post">
<postfield name="flight" value="\$flight"/>
</go>
Find Arrivals
</anchor>
</p>
</card>
</wml>
FOOTER
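The script prints the scraped fields straight into the WML page. If a field ever contained a character such as & or <, the phone might reject the page as malformed. The sketch below shows one possible way to guard against that. The helper names wml_escape and flight_paragraph are hypothetical and not part of the original script, and the sample values are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## Escape characters that are special in WML text.
## A literal '$' is doubled because WML reads a single '$'
## as the start of a variable reference.
sub wml_escape {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/&/&amp;/g;      # must run first
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/\$/\$\$/g;
    return $text;
}

## Build the <p> block for one flight from its details hash,
## using the same keys as the hash in the script.
sub flight_paragraph {
    my ($details) = @_;
    my @rows = (
        [ 'Flight',    $details->{flight_number}  ],
        [ 'From',      $details->{from}           ],
        [ 'Scheduled', $details->{scheduled_time} ],
        [ 'Status',    $details->{status}         ],
        [ 'Terminal',  $details->{terminal}       ],
    );
    my $body = join '',
        map { $_->[0] . ': ' . wml_escape($_->[1]) . "<br/>\n" } @rows;
    return "<p>\n" . $body . "</p>\n";
}

## Hypothetical details, just to exercise the helpers.
print flight_paragraph({
    flight_number  => 'BA911',
    from           => 'Somewhere & Elsewhere',
    scheduled_time => '14:05',
    status         => 'LANDED 14:01',
    terminal       => '1',
});
```

Keeping the escaping in one helper means every field goes through the same rules, rather than relying on each print statement to remember them.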