TNX Web News As HTML

Whilst trying to write the entry TV Being Installed On South Central Trains, I wanted to get some quotes from the TNX website. However, it's Flash-based, so I wasn't able to cut and paste the text, and I didn't feel like typing it out by hand.

I had a look at what the Flash file was doing over my network by running Ethereal.

It turns out the Flash file fetches the news stories from a dedicated page at http://www.tnx.tv/selectNewsArchive.asp.

This page returns the number of news items, followed by the items themselves, as URL-encoded name=value pairs joined with ampersands, which is the same format as the body of a CGI POST. For example...

&noOfRecords=10&ID0=17&Title0=TNX+Television+Enters+into+Agreement+w
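To show how that format breaks down, here is a minimal sketch that decodes the truncated sample above by hand, using nothing beyond core Perl. The helper names url_decode and parse_form are my own, made up for illustration; they are not part of the TNX site or of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means a space, %XX is a byte in hex.
# (url_decode is a hypothetical helper written for this example.)
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split a form-encoded string into a hash of decoded name => value pairs.
# (parse_form is likewise a hypothetical helper.)
sub parse_form {
    my ($data) = @_;
    my %fields;
    for my $pair (split /&/, $data) {
        next if $pair eq '';    # the sample starts with '&', so the first piece is empty
        my ($name, $value) = split /=/, $pair, 2;
        $fields{ url_decode($name) } = url_decode(defined $value ? $value : '');
    }
    return %fields;
}

# The truncated sample from above, used exactly as shown.
my $sample = '&noOfRecords=10&ID0=17&Title0=TNX+Television+Enters+into+Agreement+w';
my %news   = parse_form($sample);

print "$_ = $news{$_}\n" for sort keys %news;
```

Each numbered field (ID0, Title0 and so on) comes out as its own hash key, with the plus signs turned back into spaces. In practice a module such as CGI does this decoding for you.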

The body data returned is already in HTML format, so it's trivial to output the news to a browser. Here is an example Perl CGI script that can be run to return the news to a browser.

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use CGI;

# Fetch the URL-encoded news data that the Flash file requests.
my $url  = 'http://www.tnx.tv/selectNewsArchive.asp';
my $page = get $url;
die "Could not fetch $url\n" unless defined $page;

# CGI parses a query string passed to its constructor.
my $CGI = CGI->new($page);

print $CGI->header();
print "<html><head><title>TNX News</title></head><body>\n";

# Fields are numbered from 0: Title0, Image0, Body0, Title1, ...
for (my $x = 0; $x < $CGI->param('noOfRecords'); $x++) {
    print '<h1>' . $CGI->param("Title$x") . "</h1>\n";
    print '<img src="http://www.tnx.tv/images/' . $CGI->param("Image$x") . '">' . "\n";
    print $CGI->param("Body$x") . "\n\n";
}

print "</body></html>\n";

We download the page, pass it to the CGI module to parse, and then iterate over the numbered fields, outputting the results as we go.

Entered: 2004-05-07 23:28:48
Modified: 2004-05-22 23:52:00
