Tcl Library Source Code

Saving NetNews with Tcllib

Originally posted at http://core.tcl.tk/akupries/blog/saving-news.html


Given my various interests I am following several groups like <news:comp.lang.tcl> and <news:comp.risks> on NetNews, a global bulletin board system which was started shortly after the internet itself.

Due to the ephemeral nature of the various boards' contents, with most servers keeping messages for only a week or two, any access to older messages means that I either have to go to some website which archives them, like Google Groups, or save them on my own.

Here I describe how to do the latter, using Tcl and Tcllib.

We will need access to Tcllib's sources even if it is already installed from your favorite distribution's repositories. This is because we will be using the two scripts pullnews and dirstore, found under examples/nntp, to accomplish our task, and I know of no distribution that installs the Tcllib examples.

Edit: Stuart Cassoff tells me that OpenBSD does install the examples, since 2008.

Next, we need an account, i.e., a user name and a password, with a host serving NetNews via NNTP. If your ISP does not provide one then you have to use one of several specialized providers, like Eternal September.

With that done, below is my script

#!/bin/sh
#--
GROUP=comp.lang.tcl
#--
BASE=$HOME/Projects/Backups/News
ACCOUNT=$BASE/etc/eternal-september.org
SERVER=news.eternal-september.org
SAVETO=$BASE/archive/$GROUP
BINDIR=$BASE/bin

"$BINDIR/pullnews" -via "$ACCOUNT" "$SERVER" "$GROUP" \
    "$BINDIR/dirstore" "$SAVETO"

and its account file:

the-user-name
the-user-password
(additional optional lines ignored by pullnews)

Well, not quite. My actual paths are slightly different, I am not telling anybody my account information, and the group name is an argument. Making the equivalent changes is left as an exercise for the reader.
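
One possible take on that exercise, as a rough sketch only: wrap the call in a shell function and loop over the group names given on the command line. The function name pull_group, the environment overrides for BASE and SERVER, and the script name savenews mentioned in the comment are illustrative assumptions; the pullnews invocation itself is the same as in the script above.

```shell
#!/bin/sh
# Sketch: the script above with the group name taken from the command line.
# BASE and SERVER may be overridden from the environment; that is an
# addition for illustration, not something the original script does.

pull_group() {
    group=$1
    base=${BASE:-$HOME/Projects/Backups/News}
    server=${SERVER:-news.eternal-september.org}
    account=$base/etc/eternal-september.org
    saveto=$base/archive/$group
    bindir=$base/bin

    # Same call as before, with every expansion quoted.
    "$bindir/pullnews" -via "$account" "$server" "$group" \
        "$bindir/dirstore" "$saveto"
}

# Pull every group named on the command line, for example
#   ./savenews comp.lang.tcl comp.risks
# A failure for one group is reported and the loop moves on to the next.
for g in "$@"; do
    pull_group "$g" || echo "pulling $g failed" >&2
done
```

Keeping the account file path out of the argument list means the credentials never show up on the command line.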

Some explanations and notes are now likely in order:

Now we have a functioning backup, although our storage system is quite simple - just a directory.

If we want to use a storage system that supports more features, like an index, searching, etc., we have to look under the hood of pullnews a bit to see how it talks to the dirstore.

The relevant procedure is store_cmd, which encapsulates the builtin exec. It is called twice:

  1. store_cmd {} last

    This call queries the store for the sequence number of the last stored article, expecting it on stdout. If the result is empty, pullnews will use the sequence number of the oldest article known to the host instead.

    This is how it pulls the entire backlog on its first run and only the new articles on all subsequent runs.

  2. store_cmd $data save $lasthandled

    This saves a retrieved article into the store, with the specified sequence number. The article data is presented to the store on stdin.

Not very complicated. Any storage command which follows this simple API can be used as a backend of pullnews.
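
To make the protocol concrete, here is a rough sketch of a minimal store written as a shell script. It assumes that pullnews appends the word last, or the word save plus the sequence number, to the store command given on its command line, so that the store sees its own arguments first (here a directory). That calling convention is inferred from the script and the two calls shown above and is worth checking against the pullnews source. The name simplestore and the function names are made up for illustration.

```shell
#!/bin/sh
# simplestore (hypothetical): a minimal store for the two-call protocol.
# Assumed invocation, not verified against pullnews itself:
#   simplestore DIR last        print the highest stored sequence number,
#                               or nothing if the store is empty
#   simplestore DIR save SEQNO  read one article from stdin, keep it as
#                               DIR/SEQNO

store_last() {
    dir=$1
    # A store that does not exist yet is simply empty.
    [ -d "$dir" ] || return 0
    # Articles are files named by sequence number; print the largest.
    ls "$dir" | grep -E '^[0-9]+$' | sort -n | tail -n 1
}

store_save() {
    dir=$1
    seqno=$2
    case $seqno in
        ''|*[!0-9]*) echo "bad sequence number: $seqno" >&2; return 1 ;;
    esac
    mkdir -p "$dir" || return 1
    # Write under a temporary name first, so an interrupted run never
    # leaves a truncated article that "last" would report as stored.
    tmp=$dir/.incoming.$$
    cat > "$tmp" && mv "$tmp" "$dir/$seqno"
}

store_main() {
    if [ "$#" -lt 2 ]; then
        echo "usage: simplestore DIR last | DIR save SEQNO" >&2
        return 2
    fi
    dir=$1
    cmd=$2
    case $cmd in
        last) store_last "$dir" ;;
        save)
            [ "$#" -eq 3 ] || { echo "save needs a sequence number" >&2; return 2; }
            store_save "$dir" "$3"
            ;;
        *) echo "unknown command: $cmd" >&2; return 2 ;;
    esac
}

# Dispatch only when arguments were given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    store_main "$@"
fi
```

If the assumption about the calling convention holds, pointing the wrapper script at this file instead of dirstore should be enough to switch backends, and store_save is then the place to update an index or a search database.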

Happy Tcling.