About GeodSoft: Large Web Project: List Server Issues

List Server Issues

Lessons learned setting up list servers for association members: don't bulk load lists, don't put a list server and web server on the same machine, moderate lists if you can, limit message archive size but save the messages elsewhere.

Early Mistakes
Don't Install a List Server on a Web Server
List Configuration Options
Message Archive Options

Early Mistakes

Setting up and administering thirty some member oriented list servers was probably the most eye opening experience that I've had in my computer career. We had observed the operation of the lists for a year and a half while they were on the legal publisher's site. We knew how the important list configuration options were set and what the vocal users complained about. We had never actually set up and run lists ourselves and made some important mistakes doing so.

We knew that the lists had little or no effective control limiting them to ATLA members and also that it would be some time before our own enforcement programs were ready. We made the switch to our own server not when we thought everything was ready but as soon as we thought we could provide a net increase in site functionality. This was at the point that we had list servers functioning and integrated with the web site visually and with web site security.

Our lists divided roughly into two broad groupings. On the one hand, there were several lists that had been an outgrowth from an earlier electronic BBS that ATLA had in the early 90's and that continued with modest participation levels until ATLA started its web site. The legal publisher set up a list server for each legal oriented BBS forum and used the message base from the BBS to pre load the message archives for the lists. These lists had no direct correlation with any formal ATLA group. We also had lists that correlated one for one with ATLA's Sections. There were additional lists that matched with Caucuses that were based on certain demographics and others based on some member types such as Students.

The most important decision that my boss and I made related to the new site was to pre load the lists that directly correlated with ATLA subgroups where we had the member's email address in their record. We had discussed this briefly with Membership and though my boss and I thought we knew what we were doing, the reality was that we had no idea. Once the lists were set up and loaded with email addresses, I sent a welcome message to all the lists.

I've long since forgotten the exact chronology. What I do remember was an explosion of email, especially on the larger lists, some of which had more than two thousand email addresses. The email flood was mostly from unwitting list members sending their requests to get off the list back to the lists. These provoked more responses to the lists, including both experienced members trying to tell others not to send these messages to the lists and others asking why ATLA was bombarding them with junk email. By the time I left work that first day, I'd turned on moderation for all lists just to stop the deluge while we figured out what to do. For several days, all I did was deal with the lists and angry members. I sent uncounted email responses to messages I received or that were posted to the lists and not allowed through. When things calmed down enough, I went through all the messages that had come in (and which I had saved) and sent nearly 200 apology letters to the authors of those messages that warranted one. For several weeks the lists and related issues were my primary activity.

The last time we discussed this, my boss at the time still believed that we had been right to pre load the lists despite the pain it caused us. I've had mixed feelings but, having watched the lists for well over two years, believe we should not have pre loaded the lists. There is little question that some of the section list servers still have members that we pre loaded and who would likely never have joined the list otherwise. Pre loading the lists also unquestionably jump started them in terms of traffic. I also know that as a professional association with a much closer relationship with many of our members than businesses typically have with their customers, a reasonable argument can be made about leading our members.

Despite these reasons, I believe the traditional Internet wisdom of never putting someone on a list they did not actively ask to be on is the better approach. All the really large lists that we loaded that had no clear subject focus have dwindled to fractions of their original size and have little traffic today even though they accounted for most of the early traffic. Two and a half years later, ATLA's two largest and most active lists have a very clear subject focus of wide interest to ATLA members and never had a single email address loaded into them by staff.

With the benefit of hindsight, I think the proper course of action would have been to go live with the web site and simply let the lists take care of themselves. The time that was spent helping confused members and assuaging angry ones should have been spent on building the multi list join page that was not built until much later. Once that capability was available, a single email to all ATLA members, except those coded not to receive emails, could have announced the lists and given clear instructions on joining the lists the member might be interested in. No one can say with any certainty whether the lists would have been larger or smaller or more or less active taking this approach. What surely would have been true though is that however active they were, the lists would have gotten that way with much less anger and frustration for ATLA members and much less stress for ATLA staff.

Don't Install a List Server on a Web Server

The other early mistake we made was to install the list server software on the same machine as the web server. Unless you are sure your lists will never have any significant traffic or simply have no other option, don't install a list server on the same machine as a web server. The most important characteristic for a web server is fast response. If you have low traffic and mostly static pages, it doesn't take much of a machine to provide this. As your traffic and dynamic page content increase, your server quickly needs to become more capable.

List servers like Lyris are designed to be highly efficient engines for pumping out high volumes of email. Some list servers reduce resource use by sending the same message to many recipients. Lyris customizes every message to better control automated bounce processing and prevent anyone but the actual member from being able to change list settings for the member. Thus for every list participant with the standard mail delivery option (not digest or index) a unique copy is sent for each message that is distributed. The standard commercial version of Lyris will run 50 threads at a time to deliver messages quickly. The high performance version can send several times as many messages simultaneously.

One thing that is not important to a list server is fast response time. By its very nature email is a store and forward system and users don't expect messages to arrive instantaneously. Until a message comes in, list users have no way of even knowing it was sent. Once they have it they could see how long it took by studying headers. Even with a very fast server and a quiet Internet, with large lists there will be a noticeable discrepancy between when the first list participants get a message and when the last do. So as long as a list server is able to catch up with any backlogs that may develop temporarily, it's an ideal place to use an older slower machine or an inexpensive desktop instead of a high end server. If your members view the message databases as a valuable resource then you'll need to be sure you have well backed up and reliable disk systems but speed per se is not that important.

By the nature of its design, what a list server will do though is use as much available CPU as it can effectively put to use while it has work to do. If it's not first limited by network transmissions and the number of mail servers it can contact, it will run a CPU at 100%. This is not what you want on a web server. On single CPU NT machines even a low volume list server can have a perceptible impact on web server response time during those periods when it's sending a message. This is especially true for dynamic content. It's also important to remember that peak list use time is also likely to be peak web server time. For a predominately business site, both will peak in the mid afternoon (on the east coast). So just when it's starting to get stressed by its own web load, anything but a grossly over configured web server is going to get hit by transient CPU intensive processes each time a new message arrives at the list server.

On a multi processor machine the list server will have much more difficulty hogging the CPU, and on UNIX and UNIX like machines, nice can be used to lower the list server's priority. NT does not have a practical priority mechanism. (I don't know about 2000.)
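As a rough illustration of the nice approach on a UNIX like system, the following Python sketch builds a command line that starts a process at lowered priority. The function names and the list server path in the usage note are hypothetical; only the `nice -n` utility itself is standard.

```python
import subprocess


def niced_command(command, increment=10):
    """Prefix a command with nice so it starts at a lower CPU priority."""
    # Ordinary users can only lower priority, so accept increments 0..19.
    if not 0 <= increment <= 19:
        raise ValueError("increment should be between 0 and 19")
    return ["nice", "-n", str(increment), *command]


def start_list_server(command, increment=10):
    """Launch a (hypothetical) list server command at reduced priority."""
    return subprocess.Popen(niced_command(command, increment))
```

Calling `start_list_server(["/usr/local/bin/listserver"])` would run that assumed binary with its niceness raised by 10, which lets the web server win most contests for the CPU while still allowing the list server to use idle cycles.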

And finally, you won't realize you have a problem until it's having a perceptible impact on web performance, at which time it's difficult to do anything without interfering with your web operations. See more under my discussion of ATLA's web performance surprise.

List Configuration Options

Today's more powerful list servers come with a bewildering array of options. It's not always clear what some of the options are for or what the effect of different settings might be. There are some options that nearly all list servers have in common and that visibly affect the way that lists function for the users. List users are likely to have clear opinions, which they are not shy about expressing, on some of these. Whenever a list administrator changes a list setup in a way that is visible to list users, if the list has significant participation, there will likely be complaints. If you can, it's best to get lists set up correctly from the first and leave them alone unless something is obviously wrong.

The most important options relate to moderation. Since the "better" settings are largely a matter of opinion, I will deal with this in my book. Some of the other key options are who is the default reply to, and what message size and quantity limits, if any, are placed on the lists. If it's not an option, whether or not to allow attachments should be. For list servers that include indexing and searching options, whether or not to maintain and if so how long to maintain a list archive is a key question. The correct setting for security related options will often be obvious based on the nature of the organization and the specific list.

The strength of feeling that some list users have and express about certain settings doesn't mean their choice is necessarily correct. Generally there would not be an option if it was obvious that it should always be set a specific way. When in doubt, as with most software, go with the defaults.

One option that is very visible to members and generates strong feelings is the default reply to. E-mail clients that recognize the Reply-to: header typically address email responses to the email address provided with that header rather than the From: email address. Most list servers provide control over the reply to set by the server. The main choices on a standard list server are to reply to the list name or to reply to the original author. How this is set has a very visible impact on the flow of discussions on the list. Setting the reply to the list promotes active freer discussions, increases list volume and encourages inappropriate public replies that should be private such as thank you's and me too's. Setting the reply to the original author results in lower volume, higher quality lists where discussions tend to die more quickly. It also diverts some discussions that would have been good public exchanges into private ones.

The reasons for these effects are very simple. Lists always have a significant number of inexperienced and/or lazy participants who don't think about where each response should be directed and simply send their response to the default, whatever that is. Even the most experienced users occasionally forget about the default and make a poor choice. This is one place where, if you're in doubt, I'd go against the normal default, which is typically to reply to the list, and set lists to default to private replies to the original author. It's very simple to send an extra reply to a list after making a private response if a list user subsequently realizes a public reply was appropriate. Once a message is sent to a list publicly, there is no calling it back.
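To make the two reply to choices concrete, here is a minimal Python sketch using the standard library's email package. It is only an illustration of what the Reply-To: header ends up containing under each policy; the function names and addresses are hypothetical, and a real list server such as Lyris handles this through its own configuration rather than code like this.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def apply_reply_policy(msg, list_address, reply_to_list):
    """Set Reply-To according to the list's default reply policy.

    reply_to_list=True  -> replies default to the whole list.
    reply_to_list=False -> replies default to the original author.
    """
    # Drop any Reply-To the sender supplied so the list policy wins.
    del msg["Reply-To"]
    if reply_to_list:
        msg["Reply-To"] = list_address
    else:
        # parseaddr returns (display name, address); keep only the address.
        _, author = parseaddr(msg["From"])
        msg["Reply-To"] = author
    return msg


def example_message():
    """Build a small sample post (all values are made up)."""
    msg = EmailMessage()
    msg["From"] = "Pat Member <pat@example.org>"
    msg["To"] = "section-list@example.org"
    msg["Subject"] = "Question about filing deadlines"
    msg.set_content("Does anyone have experience with this?")
    return msg
```

With `reply_to_list=False`, a participant who simply hits reply writes only to pat@example.org; with `reply_to_list=True`, the same careless reply goes to every list member.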

If I were setting up new lists today, I'd seriously think about imposing a 4-6K limit per message for any list where there is not a good reason for long messages. Such a stringent size limit would stop unedited responses to digest messages and quickly growing reply sequences where none of the earlier messages' headers or contents are removed. It would also stop nearly all attachments, which is a choice I'd recommend if it's available as an option. There are simply too many email spread viruses today to allow attachments on any list that doesn't really need them. If you allow attachments, then you should install anti virus software that can monitor and filter an SMTP port.
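The kind of check such a limit implies can be sketched in a few lines of Python. This is an assumption laden illustration, not how any particular list server implements it: the 6K ceiling is simply the upper end of the range suggested above, and `check_post` is a hypothetical name.

```python
from email import message_from_bytes
from email.policy import default

MAX_BYTES = 6 * 1024  # assumed limit: top of the 4-6K range


def check_post(raw_bytes, max_bytes=MAX_BYTES):
    """Return (accepted, reason) for a raw message submitted to a list."""
    # Size is measured on the whole message, headers included.
    if len(raw_bytes) > max_bytes:
        return False, "message exceeds size limit"
    msg = message_from_bytes(raw_bytes, policy=default)
    # Reject any part explicitly marked as an attachment.
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            return False, "attachments are not allowed"
    return True, "ok"
```

Because most attachments are far larger than 6K, the size test alone would reject nearly all of them; the explicit attachment test catches the small ones that slip under the limit.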

Message Archive Options

ATLA's experience with Lyris message archives and indexing led us to believe that it was not feasible to have unlimited message archives of active lists without imposing an unacceptable performance penalty. I don't know if any of the competing products do this any better, but it's best to think of Lyris' searching and indexing as a handy extra and not a core capability of the product. That the feature lacks the boolean and proximity operators that are characteristic of most full text searching is a tip off that this is an incidental add on. The indexing algorithm is not efficient. Also, like other list server features, searching is limited to one list at a time.

We tried keeping unlimited archives because we viewed the message database as a valuable knowledge repository of lasting value. We also had no alternative and though we discussed it on a variety of occasions, never made a commitment to provide an alternative. If the contents of the list server message bases are perceived to have lasting value, then they should be made available outside of the context of the list server software.

I'd make it a priority to put into use a program that runs in batch mode daily and dumps all new messages as straight ASCII files or any other format that your web site's search engine can easily work with. That format could be web pages created to match your standard site design. Before doing any custom programming, it's worth checking to determine if your list server already has utilities that will do some of this. Even if there are none, this is unlikely to be a significant programming task. It should take only a few hours to write a Perl program that splits just about any consistent export format into discrete files containing a single message each.
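The text suggests Perl for this job; the sketch below uses Python instead, purely as an illustration of how small the task is. It assumes the list server can export messages in the common mbox format (messages separated by lines beginning with "From "), which may not match what your list server actually produces. The function name and the file naming scheme are hypothetical.

```python
import mailbox
from pathlib import Path


def split_export(export_path, out_dir):
    """Write each message in an mbox-style export to its own text file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    box = mailbox.mbox(export_path)
    try:
        # Messages come back in file order; number the output files to match.
        for index, msg in enumerate(box, start=1):
            target = out / f"msg{index:06d}.txt"
            target.write_bytes(msg.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

Run once a day against that day's export, with a separate output directory per list, this would leave one plain text file per message for the site's search engine to index.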

If your platform is NT and you're using both IIS and Index Server, you'll gain an efficient indexing mechanism that only needs to deal with new content once. You also have a powerful text searching tool that makes all content to which the user has access available in a single search. If your membership access to the web site is limited based on section or other sub membership types as ATLA's is, Index Server will include all groups to which the member has access and exclude those to which he or she does not. Thus the member does not need to remember which list a specific message appeared in or repeat searches. For this to work, the messages from each list need to be saved to directories to which only those eligible to use the specific list have access.

Index Server or any other full text searching tool is far better at allowing users to formulate queries that let them find and retrieve the documents they are looking for than Lyris' limited search capabilities. Other list servers are typically no better in their search options than Lyris. On the other hand, nearly all web search tools provide "and" and "or" operators and parentheses. Many provide proximity. Most provide exact phrase search capability. Those that lack these types of capabilities, such as Excite, try to search for meaning or provide relevant documents using techniques quite different than traditional boolean text searching.

Once such a mechanism is in place and messages are exported from Lyris or another list server daily, indexing may be eliminated from the message archives. It might also be left on as a convenience for users trying to find a specific recent message. With a superior alternative to the list server for searching and archiving, I'd reduce list server internal message archives and indexes to something between two weeks and two months.

There may be legal issues with creating a message archive outside the list server software. If the messages contain original creative content, then the message authors have copyright to that content. Regardless of the terms of use that may or may not be associated with list server use, you can easily make the argument that anyone posting messages to a list server has by their action given implicit permission to use the messages within whatever facilities the list server provides. Unless the terms of use on the list server are very clear, and provide for placing the messages in other archives or cover this through broader grants, the association may not have the right to move list server messages to other archives. At a minimum, a member could request that messages posted prior to such policies be removed from any external archives. If the messages could be shown to have commercial value that has been reduced by being in the archives and the author did not grant permission for this, a lawsuit is a possibility.

Copyright © 2000 - 2014 by George Shaffer. This material may be distributed only subject to the terms and conditions set forth in https://geodsoft.com/terms.htm (or https://geodsoft.com/cgi-bin/terms.pl). These terms are subject to change. Distribution is subject to the current terms, or at the choice of the distributor, those in an earlier, digitally signed electronic copy of https://geodsoft.com/terms.htm (or cgi-bin/terms.pl) from the time of the distribution. Distribution of substantively modified versions of GeodSoft content is prohibited without the explicit written permission of George Shaffer. Distribution of the work or derivatives of the work, in whole or in part, for commercial purposes is prohibited unless prior written permission is obtained from George Shaffer. Distribution in accordance with these terms, for unrestricted and uncompensated public access, non profit, or internal company use is allowed.