Re: Mills archive

Robert Walker <robert_walker@rcwalker.freeserve.co.uk>

5/13/2001 2:58:24 AM

Hello there!

I'm still working on the archive, but thought I might say what stage
it has reached so far, for those interested in it.

We seem to have pretty much all the messages between Monz's and
Manuel's archives.

Also, I've figured out how to combine them into a single archive,
with the lists in every section spanning both the Mills and
Yahoo posts. I've coded that, and it works. To try the concatenation
on a small archive first, I combined justmusic with another group
that I had downloaded earlier to test the program. The other one
was called Looms, so I got interwoven posts about JM
and about weaving in the complete lists of threads and individuals!

You can concatenate the archives for any number of groups into a single
archive in the same way.
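
A minimal sketch of that kind of concatenation, in Python, assuming each
archive is already available as a collection of messages with a parsed date.
The `Message` type, its fields and `merge_archives` are hypothetical names
for illustration, not taken from the archiver program itself.

```python
import heapq
from datetime import datetime
from typing import Iterable, NamedTuple


class Message(NamedTuple):
    date: datetime   # posting date, already parsed
    group: str       # hypothetical label for the source list
    subject: str


def merge_archives(*archives: Iterable[Message]) -> list[tuple[int, Message]]:
    """Merge any number of archives into one list ordered by date."""
    # heapq.merge expects sorted inputs, so sort each archive first.
    sorted_inputs = [sorted(a, key=lambda m: m.date) for a in archives]
    merged = heapq.merge(*sorted_inputs, key=lambda m: m.date)
    # Number the messages only after merging, because the numbers depend
    # on how many messages end up in the combined archive.
    return list(enumerate(merged, start=1))
```

Calling `merge_archives(justmusic, looms)` with two such collections gives
one numbered list in which posts from both groups are interleaved by date.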

A few more details to be dealt with, not many.

Plus, when I do the first upload of the archive, it may have some
early messages out of order, as I order them by date, and they use
varying time zones and may have varying message travel times too
so that some that were posted earlier may actually have been relayed
by the Mills list-server later, or vice versa.

There seems to be no other way to do it, as there are no
message numbers in the posts (though I could get the program to
check the time zone field that some messages have in the date
and do time zone conversion, if there is a nice table of
time zone codes somewhere or something I can use for that).
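
One possible way to do the time zone conversion, sketched in Python under the
assumption that the dates are in the usual email `Date:` header form. The
standard library parser understands numeric offsets such as `+0100` and a
small set of zone abbreviations such as `GMT` and `EST`, so no separate
table is needed for those. The helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def to_utc(date_header: str) -> datetime:
    """Parse an email-style Date header and return the time in UTC."""
    parsed = parsedate_to_datetime(date_header)
    if parsed.tzinfo is None:
        # No usable zone in the header. ASSUMPTION: treat it as UTC.
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Sorting on `to_utc(...)` rather than on the raw date text puts a post sent
at 21:30 -0400 before one sent at 02:58 +0100 the next morning. It does not
help with posts that were relayed late by the list server.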

We may just have to live with the possibility that some remain
out of sequence, perhaps with a message about it on the contents page.
Hopefully it will be worth it even with that.

It may have duplicated messages too, as sometimes a
message is in both archives, otherwise identical but with
different times in the message date fields. The program searches
for these by looking for two messages immediately following
each other that are identical, but it could miss them if
there are other messages in between.
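
A sketch of an alternative check that is not limited to adjacent messages,
in Python. This is a different technique from the adjacent-pair comparison
described above: it fingerprints each message on everything except the date
and remembers the fingerprints already seen. The dictionary keys `sender`,
`subject` and `body` are assumed field names for illustration.

```python
import hashlib


def fingerprint(sender: str, subject: str, body: str) -> str:
    """Hash everything except the date, with whitespace normalised."""
    parts = [" ".join(field.split()) for field in (sender, subject, body)]
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


def drop_duplicates(messages: list[dict]) -> list[dict]:
    """Keep the first copy of each message, wherever later copies appear."""
    seen = set()
    unique = []
    for msg in messages:
        key = fingerprint(msg["sender"], msg["subject"], msg["body"])
        if key not in seen:
            seen.add(key)
            unique.append(msg)
    return unique
```

Because the date is left out of the fingerprint, two copies with different
times in the date field still match, even with other posts between them.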

However, once we are fairly sure that the archive _is_ complete,
then it is an easy matter to go through and change the message
numbers for any posts that are clearly out of sequence (such as,
someone posts a file, and someone else comments on it apparently
before it was posted), and to delete any duplicates anyone notices.

I want to complete the archive first, as the message numbers change
depending on how many messages are in the archive.

At some much later date, I may add a little search engine too.
The archiver program would make lists of all the message numbers for
each word in the archive, then maybe there would be a Java applet or app
that would read the lists of message numbers for each word in the search phrase,
filter by each word in turn, and make an HTML page for the results.
I think perhaps it would need to be a Java app rather than an applet
to do that, because of the file permissions.
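
The word lists and the word-by-word filtering could look roughly like this.
It is a sketch in Python rather than Java, and it assumes the messages are
available as a mapping from message number to text. The function names
`build_index` and `search` are hypothetical, and writing the HTML results
page is left out.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")


def build_index(messages: dict[int, str]) -> dict[str, set[int]]:
    """Map each word to the set of message numbers that contain it."""
    index = defaultdict(set)
    for number, text in messages.items():
        for word in WORD.findall(text.lower()):
            index[word].add(number)
    return dict(index)


def search(index: dict[str, set[int]], phrase: str) -> list[int]:
    """Return the numbers of messages containing every word in the phrase."""
    words = WORD.findall(phrase.lower())
    if not words:
        return []
    # Start from the first word's list, then filter by each word in turn.
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

Each lookup only touches the lists for the words in the search phrase, which
is why this approach stays reasonably fast even for a large archive.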

I think that would be practical, and probably reasonably fast too.
It is also not too complex in the way of Java, so I could reasonably
code it if some time I do some more Java programming and
get into the way of it again. (Or if anyone else would like
to write one, the lists of message numbers for each word would
be pretty easy for my message archiver to make. Or maybe
there is a free app that does it already...)

The idea is that it would be a tiny app that one would include in the same
page as the HTML (or a link to it anyway, if it is an app),
then you run it to search the archive.

I plan to upload it as HTML, and then as zip files of 5 MB each,
which Joseph thought sounded a suitable kind of size for them.

Robert

jpehrson@rcn.com

5/13/2001 9:22:54 PM

--- In tuning@y..., "Robert Walker" <robert_walker@r...> wrote:

/tuning/topicId_22721.html#22721

Robert!

This is really fantastic work... I'm sure everyone will appreciate
the saving and archiving of the Mills list with this one!

This is an incredibly important job, since these posts were almost
lost, save for a couple careful individuals who kept them...

Congrats on this... It will be particularly worthwhile if the entire
archive is somehow searchable, despite the large size!

My personal feeling is that these two lists probably constitute some
of the most vital work going on in the field of tuning today... and
your contribution is invaluable!

Thanks again!

_______ ________ ______
Joseph Pehrson