Planning the Maintenance Script

With a site maintenance script the unique page content is placed in a single table cell and all the navigation aids and standard components are wrapped around it. The standard components can be placed anywhere relative to the unique content. The simplicity or complexity of the standard components is limited only by your imagination and programming skill. The result has many of the advantages of dynamic pages with the speed and searching advantages of static pages.

Putting It Together

In Time to Script Page Maintenance I discussed what the script had to do but nothing about how such a script would actually work. Having written similar scripts in the past, I have found that the basics remain pretty much the same. The simplest case of this type of script puts completely standard page headers and footers on every page of a site. Keep in mind that with tables the header and footer aren't limited to a few lines but may be a complex HTML table structure with logos, standard features such as a search form, and almost any form of navigation aid located on any side or sides of the page. The unique page content will likely be enclosed in a single cell of such a table. More complex scripts will vary the page contents depending on certain conditions.

The script needs to accept two arguments: an optional parameter to tell it to process subdirectories and a list of files to process. The list of files should accept wild cards and be able to contain multiple entries such as: "*.htm *.html", "*.htm *.html *.php", "*.htm *.cfm" or "*.htm *.asp".
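As a rough illustration of that command line, here is a minimal sketch in Python rather than the Perl used for the actual script. The option name `-r` and the function name `parse_args` are assumptions made for this example only.

```python
import argparse

def parse_args(argv):
    """Parse a hypothetical command line: [-r] PATTERN [PATTERN ...]."""
    parser = argparse.ArgumentParser(description="Rewrite standard page areas.")
    # Optional switch: also process subdirectories (the flag name is an assumption).
    parser.add_argument("-r", "--recurse", action="store_true",
                        help="also process subdirectories")
    # One or more wild card patterns, quoted so the shell does not expand them.
    parser.add_argument("patterns", nargs="+",
                        help='wild card patterns, e.g. "*.htm" "*.html"')
    return parser.parse_args(argv)

if __name__ == "__main__":
    args = parse_args(["-r", "*.htm", "*.html"])
    print(args.recurse, args.patterns)
```

Quoting the patterns matters: left unquoted, most shells expand them in the starting directory only, and the script would never see the wild cards it needs to reapply in each subdirectory.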

The script requires a recursive function to process a directory tree if the subdirectory option is invoked. This function needs to process files and directories in a different manner. Logically it does not matter if they are processed together or separately, or which are done first if they are processed separately. Perl's glob capability makes it easier to do them separately. In either case each file name in the current directory that matches the expanded wild card list needs to be passed to a function that processes a single file. For each directory that is not the current or parent directory entry, the script needs to change into it and call the directory processing function recursively at that point. Subdirectories need to be processed regardless of their names, i.e. which subdirectories are processed has nothing to do with matching the list of filenames that are to be processed.
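The recursion can be sketched as follows, in Python instead of Perl. The names `process_tree` and `handle_file` are hypothetical. Unlike the description above, this version passes full paths down rather than changing into each directory, and it skips symbolic links to directories to avoid loops; both are choices made for the example.

```python
import glob
import os

def process_tree(directory, patterns, recurse, handle_file):
    """Call handle_file on each matching file; optionally descend into subdirectories."""
    # Files first: expand every wild card pattern in this directory only.
    seen = set()
    for pattern in patterns:
        full_pattern = os.path.join(glob.escape(directory), pattern)
        for path in sorted(glob.glob(full_pattern)):
            if os.path.isfile(path) and path not in seen:
                seen.add(path)          # a file matching two patterns is handled once
                handle_file(path)
    if not recurse:
        return
    # Directories second: visited whatever their names are.
    # os.listdir never returns the "." and ".." entries, so no check is needed.
    for name in sorted(os.listdir(directory)):
        sub = os.path.join(directory, name)
        if os.path.isdir(sub) and not os.path.islink(sub):
            process_tree(sub, patterns, recurse, handle_file)
```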

The function that processes single files does several things. It may be designed to take raw, non-standardized HTML files that have only the minimal HTML elements necessary to define an HTML page and the page specific contents. In this case it needs to use standard HTML tags to decide where to put the standard contents. The obvious choices are the close BODY tag and either the close HEAD or open BODY tags. If new pages come with advertising that is to go above the standard elements, the open BODY tag makes sense. If the advertising goes inside the standard contents, then the close HEAD tag may make better sense.
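For the raw file case, a minimal Python sketch (not the author's Perl) of the open BODY and close BODY choice might look like this. The function name `wrap_raw_page` is hypothetical, and the regular expressions are a rough assumption that would fail on unusual markup, such as a ">" inside a BODY attribute value.

```python
import re

def wrap_raw_page(html, top, bottom):
    """Insert top after the open BODY tag and bottom before the close BODY tag."""
    open_body = re.search(r"<body\b[^>]*>", html, re.IGNORECASE)
    close_body = re.search(r"</body\s*>", html, re.IGNORECASE)
    # Refuse to guess when the page has no usable BODY element.
    if not open_body or not close_body or close_body.start() < open_body.end():
        raise ValueError("page lacks a usable BODY element")
    return (html[:open_body.end()] + top
            + html[open_body.end():close_body.start()] + bottom
            + html[close_body.start():])
```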

Alternatively, the script may assume it will only be working with files that have been developed from standard templates which include contents that are close if not identical to the final results that the script will produce. This is logically similar to reprocessing a file that has already been run through the script. Clearly the script needs to create standard markers that it can use to identify the beginning and end of the areas that it is to be replacing. These markers should never be part of the visible page contents. The only reasonable choice is that the markers be structured HTML comments that will not be confused with any other page content.

At a minimum, there will be one pair of markers that identifies the beginning and end of the area to be processed. More typically there will be at least two pairs, with one identifying the page top and the other the page bottom. There may be additional pairs that contain references to other content that will be included at various points within the HTML files. The references could be to text or HTML snippet files or might be keys to a database that contains page contents.

The script will read a file that it is processing and rewrite its contents until it finds a standard area marker. For each pair of markers, the script will rewrite the opening marker, then write the new standard contents through the closing marker and discard all the contents that were read from the file between the two markers. It will resume rewriting unaltered contents of the file following the closing marker until it encounters another marker identifying standard content to be replaced.
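The read and rewrite loop just described can be sketched in a few lines. This is Python rather than Perl, and the marker text (`STD-TOP-BEGIN`, `STD-TOP-END`) and the function name `replace_marked_areas` are invented for the example; any structured HTML comment that cannot be confused with page content would do.

```python
def replace_marked_areas(lines, areas):
    """Return lines with the content between each marker pair replaced.

    areas maps a (begin_marker, end_marker) pair to the new standard lines.
    """
    begin_to_area = {begin: (end, new) for (begin, end), new in areas.items()}
    out = []
    closing = None  # the closing marker being waited for, or None while copying
    for line in lines:
        stripped = line.strip()
        if closing is None:
            out.append(line)              # copy unaltered content and opening markers
            if stripped in begin_to_area:
                closing, new = begin_to_area[stripped]
                out.extend(new)           # write the new standard contents
        elif stripped == closing:
            out.append(line)              # keep the closing marker, resume copying
            closing = None
        # anything else here is old content between the markers and is discarded
    if closing is not None:
        raise ValueError(f"missing closing marker {closing!r}")
    return out

TOP = ("<!-- STD-TOP-BEGIN -->", "<!-- STD-TOP-END -->")
page = ["<html>", TOP[0], "old top", TOP[1], "unique content", "</html>"]
print(replace_marked_areas(page, {TOP: ["new top"]}))
```

Raising an error when a closing marker is missing is a design choice for this sketch: without it, everything after a damaged opening marker would be silently dropped.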

As a precaution, a script such as we are discussing should always keep a backup copy of the original input file. It is too easy for the script to wipe out the contents of a file that contains invalid HTML or has been damaged by the removal of necessary markers, whether by manual editing or by authoring tools that don't respect input source HTML (such as FrontPage). Creating backups creates a minor maintenance issue, but it's nothing compared to erasing the only copy of a new HTML file someone just wrote, and it is much simpler than trying to validate all input files for possible errors that could cause unexpected results. I use a function that creates up to nine backup filenames before reusing a previously used backup name. When a backup name must be reused, the oldest file is selected. This allows the script to be run several times with no fear of losing content. After final review of the changed results, all backup copies can easily be erased in a single operation. For maximum safety, you could wait until one or more system backups have been performed before erasing the backup copies.
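One way such a backup function might work is sketched below in Python (the real function is part of a Perl script and is not shown here). The `.bk1` through `.bk9` naming scheme and the names `backup_name` and `make_backup` are assumptions for illustration.

```python
import os
import shutil

def backup_name(path, limit=9):
    """Pick the first unused of path.bk1 .. path.bk9, else the oldest one."""
    candidates = [f"{path}.bk{n}" for n in range(1, limit + 1)]
    for candidate in candidates:
        if not os.path.exists(candidate):
            return candidate
    # All names are taken: reuse the backup that was written longest ago.
    return min(candidates, key=os.path.getmtime)

def make_backup(path):
    """Copy path to its backup name before the script rewrites it."""
    target = backup_name(path)
    # copyfile does not preserve timestamps, so a backup's modification
    # time records when the backup was made, which is what "oldest" needs.
    shutil.copyfile(path, target)
    return target
```

With a consistent suffix like this, erasing all backups in a single operation comes down to deleting everything that matches one wild card pattern.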

It's a relatively straightforward programming task to develop a standard script that could be used on a variety of sites where the markers point to simple text or HTML snippets to be inserted. While this could be very useful for some modest sized sites, it does not take advantage of what a script could do, and it provides no mechanism for significantly varying page content throughout the site without developing a significant library of snippets and hard coding which snippet goes where in each file.

The power of a site maintenance script will be fully realized when the script contains the logic that defines what is common and what is variable throughout a site and also the rules by which that which varies is varied. Since human powers of visualization are fairly limited, it's difficult if not impossible to visualize a site and program it without a working model of the site. Even if someone were capable of such a feat, there would be no means to communicate the appearance and working of the site to others without building it.

There are two choices. One is to write the script, run it on what site content is available, and then progressively modify the script until the desired effect is achieved. This is likely to miss many fruitful possibilities because their actual working cannot be visualized until the programming is done, and there will be reluctance to try those things which look difficult to program. It's also likely to result in much reprogramming. The alternative, which I have chosen, is to build a significant portion of a working site with traditional manual methods, focusing only on what the site design functionality should be and not considering how such functionality might be programmed.

For well over two years, I've considered the issues of a script generated site map that would vary on the pages in different parts of the site. All the programmatic solutions that I have come up with have required much redundant data. While they would have been far superior to manual maintenance methods, they still would have required significant maintenance and been somewhat error prone. Specifically, all previous solutions that I have arrived at have required placing different site maps or portions of site maps in each directory to which they would have been applicable. With a working site map in front of me, even though it's limited to one area of the site, and the ability to look at the actual HTML code, I have now been able to develop a single site wide data structure that captures all of the necessary information that program logic should be able to turn into working HTML code appropriate to the specific page in which it is located.

The issues related to the search form and powered by graphics have already been discussed in Maintaining Site Differences. Working with the previous, contents and next page specific navigation aids for this section (Making This Site) has helped me define a data structure that should allow the automated maintenance of the table of contents and the navigation aids on all pages in an area like this one. A separate text file is like a mini primary merge document in which all the constant components are contained, with markers for where the variable content, the previous and next file names and descriptions, will be placed. The script reads the text file, replacing the markers as it writes the output to the updated file.

Putting It Together

After studying the existing manually created pages, I found that each has a standard page top and bottom. The top consists of two tables with several variable elements. There are eight navigation buttons which vary depending on where in the site you are. There are the search form and powered by graphics, which will vary by the machine on which the site is running. There is the site map, which will vary with the area of the site that you are in and the specific page you are on. In some areas there are area and page specific navigation aids. The standard bottom closes the table elements that contain the search form, site map and page specific contents and also has the copyright notice.

Perl is an exceptional text manipulation language. Its strengths come largely from its built in regular expression handling and the ability to interpolate variables inside of constant text. With the ability to print or return large blocks of formatted text, which may contain variables that contain large quantities of formatted text, it's an ideal language for automating the creation of web pages and perfectly suited for solving the problem described in the preceding paragraph.

At this point the actual writing of the script is fairly straightforward. Start with any existing static HTML file that contains all the desired elements. Replace each area that changes with a variable. Read a page, and when a standard marker is found, call a function that initializes the variables in a manner appropriate to where the page is read from. Replace the old standard contents with new standard contents. Repeat until the end of the file.
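The "replace each area that changes with a variable" step might look like the following. This sketch uses Python's `string.Template` as a stand-in for Perl's variable interpolation. The page top shown, the variable names `root` and `area`, and the rule that derives them from the page's path are all invented for the example.

```python
from string import Template

# Hypothetical standard page top; ${root} and ${area} are the parts that vary.
PAGE_TOP = Template(
    '<table><tr><td><a href="${root}index.htm">Home</a></td>'
    "<td>${area}</td></tr></table>\n"
)

def init_variables(rel_path):
    """Derive the variable values from a page's path relative to the site root."""
    parts = rel_path.split("/")
    depth = len(parts) - 1                # number of directories above the page
    return {
        "root": "../" * depth,            # relative link prefix back to the site root
        "area": parts[0] if depth else "home",
    }

def standard_top(rel_path):
    """Build the standard page top appropriate to where the page lives."""
    return PAGE_TOP.substitute(init_variables(rel_path))

print(standard_top("design/plan.htm"))
```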

The page specific navigation aids in the Making This Site area are similar to the standard top and bottom, except the necessary markers will be manually placed at the appropriate location(s) inside the page specific area of each page. Some custom logic will be required for different types of insert, but the same logic that handles this area should be able to handle any area that has a contents page and a sequential series of pages with previous and next links.

The only thing that varies from one page to the next is the name of the file being linked to and the descriptive text for the link. The standard marker only needs to contain a pointer to a text file that contains the data that will be formatted into the navigation aids. The file consists of two parts. One identifies the contents page and lists sequentially the other pages, each with a short description and an optional long description. The second is constant text with markers for where the previous and next file names and link text go. The constant text may or may not have a reference to the contents page; it may have references to other pages. The graphics can be switched at will and the previous and next links can come in any order relative to each other and to any other links that may be present.
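To make the two part file concrete, here is a small Python sketch (again, not the Perl actually used). The file layout, with `|` separated fields and a blank line between the page list and the constant text, is an assumption, as are the marker names and the choice to fall back to the contents page at either end of the sequence.

```python
from string import Template

# Hypothetical definition file: page list, a blank line, then the constant text.
DEFINITION = """\
contents.htm|Contents
plan.htm|Planning|Planning the Maintenance Script
code.htm|Coding

<a href="$prev_file">Previous: $prev_text</a> |
<a href="$contents_file">Contents</a> |
<a href="$next_file">Next: $next_text</a>
"""

def parse_definition(text):
    """Split the definition into (contents entry, page entries, template)."""
    listing, constant = text.split("\n\n", 1)
    # Each entry is [file, short description] plus an optional long description.
    entries = [line.split("|") for line in listing.splitlines()]
    return entries[0], entries[1:], Template(constant)

def navigation(text, current):
    """Fill the constant text with the neighbours of the page named current."""
    contents, pages, template = parse_definition(text)
    files = [entry[0] for entry in pages]
    i = files.index(current)
    # At either end of the sequence, link to the contents page instead.
    prev_entry = pages[i - 1] if i > 0 else contents
    next_entry = pages[i + 1] if i + 1 < len(pages) else contents
    return template.substitute(
        prev_file=prev_entry[0], prev_text=prev_entry[1],
        next_file=next_entry[0], next_text=next_entry[1],
        contents_file=contents[0],
    )

print(navigation(DEFINITION, "plan.htm"))
```

Because the links are computed from positions in the list, reordering or adding a line in the list is enough to change every page's previous and next links on the next run.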

A new page in such a section can be made from any other page in the section, just by replacing the page specific contents. No adjustments need to be made to the navigation aids or contents page. Only one line needs to be added at the appropriate point in the text file that contains the list of files. Page order can be changed simply by changing the order of the files in the list. After the definition file is updated and saved and the script rerun on the whole directory, the contents and navigation aids will automatically be rebuilt with the page order as defined in the list. If the individual pages also appear in the site map, they must be added to it and the script rerun for the part of the site where that part of the site map is to be visible.

A site that is maintained by such a script can have the entire site appearance changed in minutes. It does not matter whether the script is a simple script that creates constant standard page tops and bottoms, a generalized script that works with HTML code snippets (templates) or a highly customized script with many custom code generated sections. The actual processing time varies with the size of the site, the complexity of the script and the number of pages, but I've completely changed the design and navigation aids of a 6000 page site in about 20 minutes. Days or weeks of planning may go into the redesign, but the mechanics of propagating it throughout a site are trivial with such a script.

Copyright © 2000 - 2014 by George Shaffer. This material may be distributed only subject to the terms and conditions set forth in (or These terms are subject to change. Distribution is subject to the current terms, or at the choice of the distributor, those in an earlier, digitally signed electronic copy of (or cgi-bin/ from the time of the distribution. Distribution of substantively modified versions of GeodSoft content is prohibited without the explicit written permission of George Shaffer. Distribution of the work or derivatives of the work, in whole or in part, for commercial purposes is prohibited unless prior written permission is obtained from George Shaffer. Distribution in accordance with these terms, for unrestricted and uncompensated public access, non profit, or internal company use is allowed.
