Help on Mail to Web Setup
For help on Mail-to-Web usage, check out
http://www.slb.com/admi/util/help/help_on_using_mail_to_web.html
Here are the notes required to set up mail-to-web on your
server. This setup assumes the worst case we support: multiple
web servers on a single machine, with a layout that follows our
directory setup exactly. In cases where the structure does not match,
it is likely possible to get it to work, but you will probably need assistance
from the Corporate Houston Web Information Group.
- Setup Instructions
- Description of the Various Processes
- Control Files
- HTML Data Can Be Included
- Attachments Can Be Used
- Mailkey Can Be URL
- Area Indexformat.html Controls are Editable
- Default Mailkey for each Area
- Node List Can Skip Area if areatitle is Blank
Setup Instructions
- Define a unique "webkey" of 4 letters for the server internal name.
This needs to be unique over all Corporate WebMail servers - usually the first
4 letters of your real server name will do. Other webkeys already in use
are "omne", "pogo", "nyc", "it" (should become "itse"), "slbp".
For the following instructions, we use srv1 as the server name
- change to your own.
- Setup a home area for srv1 with a copy of .cshrc from /ftp/pub/bin
- Set up an /etc/passwd entry for srv1 and a group entry in
/etc/group for srv1g that it belongs to. It must also
be a member of the "web" group.
- Web runs as user srv1.
- Web area super-user assumed to be web
- The web user can write over any portion of the web, and the other
users' group permissions and default UMASK values (002 recommended) allow
this overwriting.
- Create /ftp/web/build/webmail as a build area (owned by web).
- Copy in the latest omnebuild.tar into this area.
- Untar it as user web (tar -xf omnebuild.tar)
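Before creating the account, it can help to check the proposed webkey against the rules in the first step. The following is only a sketch in POSIX sh: the function name is hypothetical, the list of keys in use is copied from this page and may be out of date, and digits are allowed only because the example key srv1 contains one.

```sh
#!/bin/sh
# Hypothetical helper: check a proposed webkey against the rules stated above.
# The list of keys already in use is copied from this help file and may be stale.
USED_WEBKEYS="omne pogo nyc it itse slbp"

is_valid_webkey() {
    key=$1
    # Must be exactly four lowercase letters or digits (assumption: digits are
    # accepted because the example key "srv1" contains one).
    case $key in
        [a-z0-9][a-z0-9][a-z0-9][a-z0-9]) ;;
        *) return 1 ;;
    esac
    # Must not collide with a key that is already in use.
    for used in $USED_WEBKEYS; do
        [ "$key" = "$used" ] && return 1
    done
    return 0
}
```

For example, "is_valid_webkey srv1 && echo ok" prints ok, while "omne" or "toolong" would be rejected.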
Automatic Setup
The following uses the big-hammer approach. If you already have
a full setup with some special modifications, this may overwrite it.
By design, it is supposed to add the needed files and not overwrite
any files that have changed since they were last created,
but there could still be problems. On the "good" side, it often
works!
This must be done as root.
# cd /ftp/web/build/webmail
# cd bin
# ./buildsite xxxx     (xxxx must be the name of your web user - srv1 in this example)
There should be no errors reported, but if you have some special files,
it may note that it was unable to overwrite newer files on your server.
If you want the full build exactly as designed, you will need to delete
those shown as not being replaced and then run buildsite again.
Once this has been done successfully, you may need to edit the local
configuration files in /ftp/db/conf/xxxx (srv1 in this example) to
have the right defaults for your new server. If you are doing a
re-install, then you may need to compare /ftp/db/conf/srv1/back/*.html and
/ftp/db/conf/srv1/*.html to see what needs changing.
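For the re-install case, the comparison between the back/ copies and the live configuration files could be scripted roughly as follows. This is a sketch only; compare_conf is a hypothetical name, and it reports only which files differ or are missing, leaving the actual review to you.

```sh
#!/bin/sh
# Hypothetical helper: list backed-up config files that differ from,
# or are missing in, the live configuration area.
compare_conf() {
    conf_dir=$1                      # e.g. /ftp/db/conf/srv1
    for old in "$conf_dir"/back/*.html; do
        [ -f "$old" ] || continue    # the glob matched nothing
        new="$conf_dir/$(basename "$old")"
        if [ ! -f "$new" ]; then
            echo "missing: $new"
        elif ! cmp -s "$old" "$new"; then
            echo "differs: $new"
        fi
    done
}
```

Running "compare_conf /ftp/db/conf/srv1" and then "diff" on each reported pair shows what needs changing.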
Manual Details
This section can be ignored if the automatic buildsite worked without
errors. But, it may help to step through it to understand the setup
built by buildsite.
- Web main branches should be set to "g+s" permissions to allow users
to write files with group ownership remaining as the web user.
- General Script area for the web - /ftp/pub/bin
- Server names are srv1 (and/or "srv2" or any other)
for the purposes of this help file
- CGI scripts are in /ftp/web/cgi-bin/srv1 and
/ftp/web/cgi-bin/srv2 for most CGI scripts. The area
/ftp/web/cgi-bin is reserved for shared scripts but they cannot
be run directly from that area as it is not allowed by the web server
setup (assumed).
- Server document roots are at /ftp/web/srv1 and
/ftp/web/srv2.
- Server cgi roots are at
/ftp/web/cgi-bin/srv1 (and/or /ftp/web/cgi-bin/srv2).
- The following files exist in the areas noted (but just copy everything,
as some new files may be missed here):
- /ftp/db/regi/server.html
web_config.html - List of valid Corporate web servers for lookup by
IP/DNS to webkey - 1 such file per machine, not per webserver
implementation
- /ftp/db/conf/srv1/indexformat.html
web_config.html - All server control parameters
- /ftp/db/regi/srv1 - full web structure containing all .dbm files for
any web area (usually htmlelement.dbm, indexformat.dbm, and
indexsummary.dbm per web area). Structure is built dynamically when
needed.
- /ftp/db/acl/srv1 - Contains the .dat and .dbm files used to control
Access Control Lists for the server
- /ftp/db/dpar/srv1 - System builds one .dbm and .dat file for each
control parameter that is found in any indexformat.html file. Will be
used when someone needs to know all values of a particular parameter
across the web structure.
- /ftp/db/excite/srv1 - full Excite indexes are built here (can be
large if multiple indexes built or many files included in the web).
- /ftp/db/keys/srv1 - webtable.dat and webtable.dbm files are built here
- /ftp/db/ldap/srv1 - LDAP data structure
- /ftp/db/link/srv1 - full web structure built for any xxx/image/xxx
file that is found as a link in some other file on this web
(/ftp/pub/bin/omnelink is usually run nightly - builds this area)
- /ftp/db/mweb/srv1 - Original emails are kept here in a structure by
date - each day's emails are kept in a raw form in a directory per day.
- /ftp/db/ofid/srv1 - Large .dbm files containing all summary data
for each file on the server.
- /ftp/db/ph/srv1 - ph data structure
- /ftp/db/serv/srv1 - area used by Netscape web server
- /ftp/db/back/srv1 - full web structure containing backup files for
each area (if any) - structure is built dynamically when needed
- /ftp/db/arch/srv1 - full web structure containing archived files for
each area - structure is built dynamically when needed
- /ftp/pub/bin/omneinit.pl - Support for general CGI scripts
- /ftp/pub/bin/webmail.pl - Does the mail-to-web front-end (buffering)
- /ftp/pub/bin/wm_func.pl - Does the mail-to-web conversion
- /ftp/pub/bin/text2html.pl - Does the HTML conversion
- /ftp/pub/bin/htmlmail.pl - Supports mailing of general HTML and other files
- /ftp/pub/bin/makeifcache.pl - Used to deal with Corporate DBM files
- /ftp/pub/bin/makeifnews.pl - standalone builds news files - future
- /ftp/pub/bin/makewebtable - Builds the system mailkey lookup file
- /ftp/pub/bin/omime.pl - Caller of the C routines for MIME decoding
- /ftp/pub/bin/omime - Executable C routines
- /ftp/pub/bin/omnefile.pl - Includes Corporate-specific file routines
- /ftp/pub/bin/mailform.pl - Used to handle email lookup matching
- /ftp/pub/bin/styleform.pl - Used to present the list output in makeindex
- /ftp/pub/bin/backup.pl - Used in future backup file management
- /ftp/pub/bin/session.pl - Used in future session key maintenance
- /ftp/pub/bin/cgiwebkey.pl - Determines the user's webkey
- /ftp/pub/bin/dbJob.pl - Handles building DBM's as used by system
- /ftp/pub/bin/cacheData.pl - Handles building caches for makeindex
- /ftp/pub/bin/tableDBM.pl - uses dbJob to build an indexed table for caching
- /ftp/pub/bin/wrapper.c - source for the general wrapper function
- /ftp/pub/bin/wrapper - compiled re-namable executable wrapper.c
- /ftp/pub/bin/
- /ftp/pub/bin/buildsummary.pl - Builds the summary data for makeindex
- /ftp/pub/bin/returnemail.pl - used by mailform for mail distribution
- /ftp/pub/bin/ofidcheck.pl - Used to build the OFID dbm lists
- /ftp/pub/bin/word2x - Word document conversion for older Word documents
- /ftp/pub/bin/field.pl - Support for general CGI scripts
- /ftp/pub/bin/omnefile.pl - Support for general CGI scripts
- /ftp/pub/bin/makeindex.pl - Builds the file index after any new addition
- /ftp/web/cgi-bin/updateindex.pl - CGI call to build new index
- /ftp/web/cgi-bin/genhandler.pl - CGI call to handle generic forms
- /ftp/web/cgi-bin/metagenhandler.pl - CGI call to handle generic file creation
- /ftp/web/cgi-bin/paramedit.pl - CGI call to handle file editing
- /ftp/web/cgi-bin/createarea.pl - CGI call to build a new list area on web
- /ftp/web/cgi-bin/at_search.pl - Excite search interface
- /ftp/web/cgi-bin/move302.pl - Handles redirects to other web pages
- /ftp/web/cgi-bin/next_file.pl - handles button to jump from one file to next
- /ftp/web/cgi-bin/prev_file.pl - (same) handles button to jump between files
- /ftp/web/cgi-bin/viewindex.pl - used to show the updated local index
- /ftp/web/cgi-bin/server.html - list of servers and their keys for lookup
- /ftp/web/cgi-bin/webtable_$webkey.dat - text lookup of mailkey to
filing area (may move)
- /ftp/web/cgi-bin/web_config_vast.html - server configuration file
- /ftp/web/cgi-bin/indexformat_vast.html - system level default format file
- /ftp/web/bin/makeindex.pl has a C-wrapper (copy of above
wrapper) that is installed in /ftp/web/bin/srv1 area for each server. The
permissions for each copy of this wrapper are set to 6775 (SETUID and SETGID).
- /ftp/pub/bin/webmail.pl has a C-wrapper (copy of above
wrapper) that is installed in /ftp/web/bin/srv1 area for each server.
The permissions for each wrapper are then set as required for use on
that server. The /etc/mail/aliases file points to each of these wrappers
for each server.
- /ftp/pub/bin/omime - a compiled C program - exists and
is runnable
- Server web area consists of Node, List, Private, and Image areas that
are arranged under (above?) the root as follows: Nodes contain no user
data files, only other Node and List subdirectories; a List area contains
no Nodes; and each List, Node, or Private area has an Image
area underneath (a terminal directory in the tree). In addition,
each Node or List area contains one each of the arch.d, back.d and "image"
subdirectories underneath. Private areas (under a Node) have no rules
but also no resultant structure.
- There must be an indexformat.html in any area that is planning
to change any of the formatting rules inherited from areas closer to
DOCROOT (or finally from the
/ftp/db/conf/$webkey/indexformat_$webkey.html default).
- Each List or Node area must contain:
- An indexformat.html file (optional if all inherited params are OK
for this area as well) set-up as needed for the area (AreaType
parameter must be set to "list" or be blank - default -).
- If the area above this area has an indexformat.html file and will
then get an automatic list of the areas below, then the areatitle must
be filled in this indexformat.html (or it will not show in the node list).
- An image area (commonly image). This area will receive the
MIME portion of any mail message. If not found, the MIME data will be
discarded. All other non-HTML files associated with the area are also
stored here.
- An archive area (commonly arch.d - no option on name really).
This area is used for removing old data from the web area - anything in
the area will not be searched nor available via the normal links, but
users may find it by specifying arch.d in the URL directly.
- A backup area (commonly back.d - no option on name really).
This area is used for making temporary backups. Anything in this area
can be deleted (and soon will be) one month after it was put there. The
filename of each file added to this area should include a final key that
includes the date code of when it was moved there as the first 3
numbers/letters.
- You can run makeindex -w srv1 from within
the area (or pass the full path of the area to be built as the next
argument to makeindex) before using mail-to-web. Ensure that web
user has permissions to write over what you have created (your UMASK should
be 002 in your .cshrc or .tcshrc setup files). If you are creating files
that have no group read and/or write permissions, none of the utilities
will work correctly. The World/Other bits can be non-writable, but they
must be readable by the web (it runs as nobody).
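The directory rules above for a new List or Node area can be sketched as a small shell function. This is an illustration under assumptions, not the buildsite logic: make_list_area is a hypothetical name, and it only creates the area with its image, arch.d and back.d subdirectories, group-writable and with g+s set. The indexformat.html file and the makeindex run are still up to you.

```sh
#!/bin/sh
# Hypothetical helper: create the skeleton of a List (or Node) area.
make_list_area() {
    area=$1                          # full path of the new area
    (
        umask 002                    # group-writable by default, as recommended above
        mkdir -p "$area/image" "$area/arch.d" "$area/back.d" || exit 1
        # g+s keeps new files in the directory's group (the web group).
        chmod g+s "$area" "$area/image" "$area/arch.d" "$area/back.d"
    )
}
```

The subshell keeps the umask change from leaking into the calling shell. After something like "make_list_area /ftp/web/srv1/gen/news" (an example path only), makeindex -w srv1 can be run from within the new area as described above.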
Mail arrives to be put on server
Omnes Excite Search Setup
Structure
Our version of Excite (substantially modified) requires the following
setup (again, xxxx is the server WebKey).
- /ftp/web/excite - contains all program and setup files
(owner web)
- /ftp/web/excite/image - Needed images for the search use
(owner web)
- /ftp/db/excite/ - base collection point
(owner web)
- /ftp/db/excite/xxxx - base collection point for each webserver
(owner $webkey - the web user for that web)
- /ftp/db/excite/$webkey/collections - place where collections are
created for each server
- /ftp/web/xxxx/admi/util/sear - contains various needed web pages
as built by excite. (owner web)
- /ftp/web/cgi-bin/at_search.pl (or at_search?) for the web interface.
- Usually, /ftp/web/$webkey/gen/general_search_page.html or a similar
file that contains a comprehensive main search page (as found by choosing
some search icon on the front page).
- Generally, a search banner is available at the top of each generic
list area or node area presentation. These have an added set of search
terms that help to focus the search on data within that area. This
is done by having the keywords for the area included automatically
in the search terms used when something else is filled in.
Setup
Untar the omnebuild.tar file in "/ftp/web/build/webmail/".
Copy the given slbp.conf to srv1.conf and slbp.cf to srv1.cf for your
srv1 server setup (in /ftp/web/excite). Copy do_slbp.pl to
"do_srv1.pl" and ensure execute privileges are set.
When initially installing a server, create the areas listed above and
set the ownership listed. All files in /ftp/web/excite are owned by
web except for srv1.conf and srv1.cf which must be owned by the
web owner (srv1) (I believe it will work if they are owned by
web as long as srv1.cf does not need to be rewritten
by the system - should not...).
Copy admi/util/sear area, data, subdirectories, etc into your server at
/ftp/web/srv1/admi/util/sear (owned by web). No writing will take place
in this area by the web server.
Setup the /ftp/db/excite/srv1 and /ftp/db/excite/srv1/collections area
(owned by srv1).
Change to user srv1, cd to /ftp/web/excite, and then
execute do_srv1.pl. It should build your indexes if you have data
at /ftp/web/srv1.
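The copy step for the Excite templates might be scripted as below. This is a sketch with a hypothetical function name. It only copies slbp.conf, slbp.cf and do_slbp.pl to names based on your webkey and sets execute permission on the script. It does not edit the contents of the copies and does not set ownership.

```sh
#!/bin/sh
# Hypothetical helper: copy the slbp templates to names based on a webkey.
excite_copy_templates() {
    excite_dir=$1                    # e.g. /ftp/web/excite
    webkey=$2                        # e.g. srv1
    cp "$excite_dir/slbp.conf"  "$excite_dir/$webkey.conf"  || return 1
    cp "$excite_dir/slbp.cf"    "$excite_dir/$webkey.cf"    || return 1
    cp "$excite_dir/do_slbp.pl" "$excite_dir/do_$webkey.pl" || return 1
    # Ensure execute privileges are set on the per-server script.
    chmod 775 "$excite_dir/do_$webkey.pl"
}
```

A typical call would be "excite_copy_templates /ftp/web/excite srv1", run by a user who can write in that directory.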
The files to be found in /ftp/web/excite are (srv1 is your WebKey):
- srv1.conf - Editable IFF you need something different than a
collection named according to your webkey
- do_index.pl - generic index building - use instead of do_srv1.pl as long
as you do not need to set a different indexing point or a name for the collection
that is different than the webkey for your webserver.
- do_srv1.pl - As above, should work as is (just renamed to use your webkey
instead of srv1). Note that you have to run this while logged in as user srv1.
- srv1.cf - Could be created by Excite, but do your own - rest is not
supported anymore. Just change the "omne" references to be your own srv1
- filter.txt - should be left as is initially - removes *.d files and indexABCD.. files from the files considered
- stem.tbl - Leave as is unless you know more than we do
- stop.tbl.key - Leave as is unless you know more than we do
- stop.tbl.ptr - Leave as is unless you know more than we do
- exclude - Leave as is unless you know more than we do
- include - Leave as is unless you know more than we do
- aindex.pl - Perl scripts used to wrap architext_index binary
- aquery.pl - Perl scripts used to wrap architext_search binary
- architext.pl - Rest of the Perl code used to interface binaries
- architext_conf.pl - All Corporate mods to the code were in these perl files
- architext_map.pl
- architext_notify.pl
- architext_query.pl
- afeatures.pl
- os_functions.pl
- architext_index - binary indexing executable - cannot be changed
- architext_search - binary search executable - cannot be changed
- at_search.pl - Copy of the program that must be put in /ftp/web/cgi-bin
No files are needed under /ftp/db/excite/srv1, but the directory needs
to be there for each webserver (plus /ftp/db/excite/srv1/collections)
and both need to be owned by the web area owner.
Copy at_search.pl above to /ftp/web/cgi-bin area for use by web forms.
If you are using a standard setup, then you can ignore do_XXXX.pl and just
use do_index.pl. Copy it (or check that some previous web build has already
done so) to /ftp/web/bin/do_index.pl and ensure that it is executable (775).
This shared version will build an index named for your webserver when you
run it as your web server's user (XXXX).
IFF you have modified do_XXXX.pl to be specific to your needs and it is no
longer the generic version that indexes all your webserver, then you must
copy do_XXXX.pl to /ftp/web/bin/XXXX/do_XXXX.pl (replace XXXX with your webkey).
Ensure that it is set to be executable (775).
Use "crontab -e" (when running as XXXX user) to add one of the following
lines (depending on whether you can use the generic version or you must use
the version specific to your webserver):
50 00 * * * /ftp/web/bin/do_index.pl
or
50 00 * * * /ftp/web/bin/XXXX/do_XXXX.pl
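Choosing between the two crontab lines could be automated along these lines. This is a sketch under an assumption not stated above: that the presence of an executable do_XXXX.pl under /ftp/web/bin/XXXX is what marks a server as needing the specific version. cron_line is a hypothetical name and only prints the line; adding it to the crontab is still done by hand.

```sh
#!/bin/sh
# Hypothetical helper: print the crontab line to use for a given webkey.
cron_line() {
    bin_dir=$1                       # e.g. /ftp/web/bin
    webkey=$2                        # e.g. srv1
    specific="$bin_dir/$webkey/do_$webkey.pl"
    if [ -x "$specific" ]; then
        # Assumption: an executable server-specific script means "use it".
        echo "50 00 * * * $specific"
    else
        echo "50 00 * * * $bin_dir/do_index.pl"
    fi
}
```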
Excite Bugs - How to Survive Indexing
The excite indexer has some severe bugs. All (except 1 to date) can be
worked around with knowledge. Here is our best knowledge on how to
deal with them.
To date, we have made indexing work on up to 200000 files.
When over roughly 50,000 files, the "Search by Example" buttons
fail to work due to some internal limits being exceeded in the binaries.
Even small collections can fail during indexing due to "empty" files.
Note that "empty" needs description - files of 0 length (leftover junk)
are "empty" and should be deleted in any case as they are not useful on
a web. But note that any file that contains no text other than in
HTML constructs is "empty". All that is needed is one \
somewhere in the file.
Debugging for files of this type requires either a special script that
searches for such files (usually found in frame implementations) or
multiple trial indexing passes. There is a log file (in
/ftp/db/excite/$webkey/*.log) after any excite indexing has run or failed.
The last line shows the file that caused the crash during indexing.
If you edit this file, you will find that it contains no text other than
the HTML constructs (often a table with only graphic buttons). Just
add an \ before the ending </html> (or before the last
</body> if used as it should be). Then rerun the index
(/ftp/web/bin/do_index.pl on most Corporate machines). When it fails again,
repeat until it completes successfully.
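A "special script" of the kind mentioned above might look like the following. It is only an approximation, and the function name is hypothetical: it treats anything between "<" and ">" as an HTML construct and reports .html files that have nothing but whitespace left once those are removed (0-length files included). Whether this matches exactly what the Excite indexer considers "empty" is an assumption.

```sh
#!/bin/sh
# Hypothetical helper: list .html files with no text outside of tags.
find_textless_html() {
    dir=$1                           # e.g. /ftp/web/srv1
    find "$dir" -type f -name '*.html' | while IFS= read -r f; do
        # Join lines, drop anything that looks like a tag, then drop whitespace.
        rest=$(tr '\n' ' ' < "$f" | sed 's/<[^>]*>//g' | tr -d '[:space:]')
        [ -n "$rest" ] || echo "$f"
    done
}
```

Running it over the document root before indexing gives a list of candidate files to fix, which may save some of the trial indexing passes.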
Control Files
WebMail has some serious control file areas (for each xxxx server).
- In all below, anywhere you see "$webkey", put in the key for your server
- /ftp/db/mweb/$webkey/ - used to archive each incoming mail message
under a year/month/day directory structure ($webkey99/$webkey9903/$webkey990329)
with one area per day. Each filename is that file's OFID (Omnes File
ID) that is unique within the realm of all Corporate Web Servers.
- /ftp/db/keys/$webkey/webtable_$webkey.dat is the lookup for mailkeys
to web areas.
- /ftp/db/keys/$webkey/webtable_$webkey.dbm - the fast cached version
of the webtable.
- /ftp/db/acl/$webkey/indexacl_$webkey.dbm - fast cached version of
all collected access control queries from each area's indexformat.html
files (collected by /ftp/pub/bin/makeaclcache.pl routine). This DBM
is Netscape's Web Server's only way of interfacing with the web controlled
access for each area on the web.
- /ftp/db/conf/$webkey/indexformat_$webkey.html - System format defaults for
the server. This file contains all parameters that are used in each
web area's indexformat.html files. If one is missing from a local version,
this area's parameter is then used as the default (inheritance).
- /ftp/db/conf/$webkey/indexformat_$webkey.dbm - fast cached version
of above indexformat_$webkey.html parameter file.
- /ftp/db/conf/$webkey/web_config_$webkey.html - Server setup
parameters
for each server.
- /ftp/db/conf/$webkey/web_config_$webkey.dbm - fast cached version
of above web_config_$webkey.html control file. Note that this file must
be there or else the Netscape Web Server will be unable to know anything
about this web server's setup when trying to do authentication.
- /ftp/db/ofid/$webkey area contains the DBM files used for all summary
caches used to speed up building of indexes and for running the
Omnes News Service. The files can be completely built from the system
by running /ftp/pub/bin/xxxx - this will take 3-5 days on ohwig01.
It does not start from scratch but updates the existing DBM data.
- /ftp/db/ph - contains the ph server used for authentication.
- /ftp/db/ldap - contains LDAP data for LDAP server
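As a small illustration of the per-day mail archive layout described above, the path for a given day can be assembled as follows. The function name is hypothetical, and the nesting of the year, month and day directories directly under /ftp/db/mweb/$webkey is read from the example pattern, so treat it as an assumption.

```sh
#!/bin/sh
# Hypothetical helper: build the archive directory path for one day.
mweb_day_dir() {
    webkey=$1; yy=$2; mm=$3; dd=$4   # two-digit year, month and day
    echo "/ftp/db/mweb/$webkey/$webkey$yy/$webkey$yy$mm/$webkey$yy$mm$dd"
}
```

For example, "mweb_day_dir srv1 99 03 29" prints /ftp/db/mweb/srv1/srv199/srv19903/srv1990329.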
Mailkey Can Be URL
As detailed above, the mailkey can be replaced with the actual URL path
(without the "http://IP.Address.Part"). In the example above, "/serv/gen/"
(note the starting and ending "/") can be used in place of servgen.
If the keyword has not yet been defined in webtable_$webkey.html, but the
area is set up with the right permissions and index files, then the URL path
can be used instead.
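The distinction between a plain mailkey and a URL path comes down to the starting and ending "/". A sketch of that check, with a hypothetical function name (this is not the test used in wm_func.pl):

```sh
#!/bin/sh
# Hypothetical helper: succeed if the key looks like a URL path ("/.../").
is_url_path() {
    case $1 in
        /*/) return 0 ;;             # starts and ends with "/"
        *)   return 1 ;;
    esac
}
```

With this, "/serv/gen/" is treated as a URL path, while "servgen" is treated as an ordinary mailkey.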
Area Indexformat.html Controls are Editable
From each list of files (indexed list), there is a button with a Vulcan
ship or a sideways "3" that will allow you to enter a password for the area
(the mailkey for the area is already filled in). This will then show the
settings in the indexformat.html file and allow most of them to be edited
(some are not set to be editable as yet).
Default Mailkey for each Area
Each area automatically has a mailkey that is the full URL path without
the unix directory slashes. In other words, serv/proj/repo would
get an automatic mailkey (beyond any specified in the indexformat.html
file) of servprojrepo. This is only true for areas that have at
least one legal mailkey specified in the indexformat.html file.
No longer true? IWF
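The rule for the automatic mailkey, as described, is simply the area path with the slashes removed. A sketch with a hypothetical function name (given the note above, the rule itself may no longer apply):

```sh
#!/bin/sh
# Hypothetical helper: derive the automatic mailkey from an area path.
default_mailkey() {
    printf '%s\n' "$1" | tr -d '/'
}
```

For example, "default_mailkey serv/proj/repo" prints servprojrepo.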
Node List Can Skip Area if areatitle is Blank
If you do not fill in some non-blank data for the areatitle parameter in the
area's indexformat.html file, then the node list above that area will skip
that area in the list. This can be used to advantage to create areas that
can be found by setting the URL directly but cannot be seen on the automatic
list above. For example, one can set up dum1 and dum2 directories that are
invisible until the indexformat.html file for the area is updated with a
non-blank areatitle entry.
http://www.choral.org/admi/util/help/.html

fetterley@houston.sns.slb.com