TNX Web News As HTML
Whilst trying to write the entry TV Being Installed On South Central Trains, I wanted to get some quotes from the TNX website. However, the site is Flash-based, so I couldn't copy and paste the text, and I didn't feel like typing it out by hand.
I had a look at what the Flash file was doing over the network by running Ethereal.
It turns out the Flash file fetches the news stories from a dedicated page at http://www.tnx.tv/selectNewsArchive.asp.
This page returns the number of news items and the items themselves as URL-encoded key/value pairs, the same format as the body of a CGI POST. For example...
&noOfRecords=10&ID0=17&Title0=TNX+Television+Enters+into+Agreement+w
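To make the format concrete, here is a minimal sketch of decoding such a string by hand, using only core Perl. The sample string is made up for illustration (it is not real TNX data), and the helper names decode_component and parse_fields are my own; the only fields assumed are the ones used elsewhere in this entry.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is a hex-encoded byte.
sub decode_component {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split "&key=value&key=value" into a hash of decoded fields.
sub parse_fields {
    my ($data) = @_;
    my %fields;
    for my $pair (split /&/, $data) {
        next if $pair eq '';    # skip the empty piece before a leading '&'
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        $fields{ decode_component($key) } = decode_component($value);
    }
    return %fields;
}

# Hypothetical sample in the same shape as the real response.
my $sample = '&noOfRecords=1&ID0=17&Title0=Example+Title&Body0=%3Cp%3EHello%3C%2Fp%3E';
my %news = parse_fields($sample);

print "$news{Title0}\n";    # prints "Example Title"
print "$news{Body0}\n";     # prints "<p>Hello</p>"
```

In practice there is no need to write this yourself, since the CGI module already knows how to parse this format.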
The body data returned is already HTML, so it's trivial to output the news to a browser. Here is an example Perl CGI script that returns the news as a web page.
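#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use CGI;
# Fetch the URL-encoded news data that the Flash file normally requests.
my $url = 'http://www.tnx.tv/selectNewsArchive.asp';
my $page = get $url;
die "Could not fetch $url\n" unless defined $page;
# The response looks like a query string, so CGI can parse it directly.
my $CGI = CGI->new($page);
print $CGI->header();
print "<html><head><title>TNX News</title></head><body>\n";
# Items are numbered from 0 to noOfRecords - 1.
my $count = $CGI->param('noOfRecords') || 0;
for (my $x = 0; $x < $count; $x++) {
    print '<h1>' . $CGI->param("Title$x") . "</h1>\n";
    print '<img src="http://www.tnx.tv/images/' . $CGI->param("Image$x") . '">' . "\n";
    print $CGI->param("Body$x") . "\n\n";
}
print "</body></html>\n";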
We download the page, pass it to the CGI module to parse, and then iterate over the items, outputting each one as we go.
