Earthworm Modules:
Startstop Overview
(last revised 22 Feb, 2007)
This program starts and stops an Earthworm system. It reads its configuration
file which specifies the message transport rings to be created, which modules
are to be run, and the names of the parameter files each module is to read on
startup. The program is system dependent, and versions are available for
the Solaris, Linux, and Windows operating systems.
For startstop to work, it must know about the Earthworm environment. This
is typically done by setting the environment variables within the environment/ew_*
file specific to your platform, and then sourcing that file, or executing the
cmd if you're on Windows. Startstop typically reads its configuration file from
the EW_PARAMS directory (as defined in your environment) and creates the specified
rings. It then starts each module as a child process, passing its configuration
file name, and any other parameters as its command line parameters (argv, argc).
Each module (child process) is started with the priority indicated in startstop*.d.
Note that each module and each ring specified must be defined within earthworm.d
or earthworm_global.d, which should be in the EW_PARAMS directory. The system
continues to run until "quit" is typed in startstop's command window. Startstop
then sets a terminate flag in each transport ring. Each well-behaved module
(child process) should periodically check for the terminate flag, and exit gracefully
if it is set.
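
The terminate-flag convention described above can be illustrated with a minimal Python sketch. It is not Earthworm code: the `Ring` class, its method names, and `run_module` are hypothetical stand-ins, and a `threading.Event` takes the place of the flag that startstop sets in a real transport ring.

```python
import threading
import time


class Ring:
    """Hypothetical stand-in for a transport ring; only the terminate flag is modelled."""

    def __init__(self, name):
        self.name = name
        self._terminate = threading.Event()

    def set_terminate(self):
        # What startstop does to every ring once "quit" is typed.
        self._terminate.set()

    def terminate_requested(self):
        return self._terminate.is_set()


def run_module(ring, work, poll_interval=0.01):
    """Call work() repeatedly until the ring's terminate flag is set, then return."""
    iterations = 0
    while not ring.terminate_requested():
        work()
        iterations += 1
        time.sleep(poll_interval)
    return iterations


if __name__ == "__main__":
    ring = Ring("EXAMPLE_RING")  # illustrative ring name
    worker = threading.Thread(target=run_module, args=(ring, lambda: None))
    worker.start()
    time.sleep(0.05)
    ring.set_terminate()         # simulate the operator typing "quit"
    worker.join(timeout=1.0)
    print("module exited:", not worker.is_alive())
```

The point of the sketch is only the shape of the loop: the module checks the flag on every pass, so it notices the shutdown request within one polling interval and can clean up before returning.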
Note that two copies of startstop pointing at the same startstop*.d file are
not allowed to run simultaneously. The second one started will fail and quit.
(If you really want to do this for some reason, you'd need to make sure that
the second copy uses entirely different rings, different ports for the modules,
and a different startstop*.d file, specified as a parameter when starting startstop.)
If the user presses the "Enter" key while the startstop command window is selected, startstop will print a status table showing various statistics for each module, including whether it is dead or alive.
Startstop will also react to 'restart' messages from statmgr. This is part
of a scheme which works as follows: A module may have the token "restartMe" in
its .desc file (the file given to statmgr, which tells it how to process exception
conditions from that module). If the module's heartbeat ceases, statmgr will send a
restart request to startstop. Startstop will then kill the offending module and restart
it with the same arguments it used at startup time. There are also some
system-specific features, listed in the platform sections below.
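
The heartbeat-driven restart scheme can be sketched as follows. This is an illustrative model only: the `Watched` record, its field names, and `restart_requests` are hypothetical, and the timeout values are made up. It shows the decision statmgr is described as making, namely that a restart is requested only for modules whose heartbeat is overdue and whose .desc file carries "restartMe".

```python
from dataclasses import dataclass


@dataclass
class Watched:
    """Hypothetical per-module record, loosely standing in for a .desc entry."""
    name: str
    timeout_s: float        # longest allowed gap between heartbeats (assumed field)
    restart_me: bool        # True if the .desc file contains "restartMe"
    last_beat: float = 0.0  # time of the most recent heartbeat


def restart_requests(modules, now):
    """Return names of modules with an overdue heartbeat that asked to be restarted."""
    return [
        m.name
        for m in modules
        if m.restart_me and (now - m.last_beat) > m.timeout_s
    ]


if __name__ == "__main__":
    modules = [
        Watched("module_a", timeout_s=30, restart_me=True, last_beat=100.0),
        Watched("module_b", timeout_s=30, restart_me=False, last_beat=100.0),
        Watched("module_c", timeout_s=60, restart_me=True, last_beat=100.0),
    ]
    # module_a is 40 s overdue against a 30 s limit and has restartMe set;
    # module_b is overdue but did not ask for restarts; module_c is still in time.
    print(restart_requests(modules, now=140.0))  # prints ['module_a']
```

Passing `now` explicitly keeps the function free of clock calls, which makes the decision easy to check by hand.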
Interactive commands:
Startstop will respond to the following commands from the status console window.
There are similar command line versions of each command as well.
- restart <pid> or restart <module name>
- Startstop will send the module a message to exit, and may try to kill
it if it doesn't quit by itself in a certain period of time. Next startstop
will attempt to start the process back up.
- Note that the <module name> must be unique for this to work as
an argument. The command line version can only accept the pid (Process
Id) as an argument.
- stopmodule <pid> or stopmodule <module name>
- Startstop will send the module a message to exit, and may try to kill
it if it doesn't quit by itself in a certain period of time. Startstop
will not try to start the process back up, and statmgr shouldn't try to
restart it either.
- Note that the <module name> must be unique for this to work as
an argument. The command line version can only accept the pid (Process
Id) as an argument.
- Within startstop, this can be abbreviated to just "stop <pid>"
or "stop <module name>".
- The command-line "stopmodule" should mark the module as intentionally
stopped, showing up as "Stop" in the status listing. This differs
from the command line tool "pidpau", which simply kills a
module without marking it as "Stop". If statmgr is set to
monitor and restart that module, a process killed by "pidpau"
will be started back up again; a module stopped by "stopmodule"
should not be.
- The module is stopped only for as long as this startstop session
is running! If you want to stop a module permanently, you'll also want
to remove it from the startstop*.d and statmgr.d files so it doesn't
get started up next time around.
- reconfigure
- Startstop will re-read startstop_nt.d, startstop_unix.d or startstop_sol.d,
allocate any new rings, and start up any new modules it finds in the
new .d file. In the process it also re-reads earthworm.d and earthworm_global.d,
in case new module IDs or new ring IDs have been added
there.
- As the final reconfigure step, statmgr is restarted as well so it re-reads
its config file. Any modules that were added to startstop*.d should be
added to the statmgr.d config file as well.
- The command line version does the same thing.
- Within startstop, this can be abbreviated to just "recon".
- quit
- Startstop will send all child processes (modules) a request to quit,
and will kill them if they don't quit within 30 seconds or so. It will
then shut itself down.
- The command line equivalent to "quit" is called "pau".
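
The console commands above can be modelled with a short sketch. Everything here is hypothetical bookkeeping rather than startstop's actual code: `parse_command`, `StatusTable`, and the "Dead" status string are assumptions made for illustration. The sketch captures three points from the list: the short forms "stop" and "recon", the requirement that a module name be unique when used as an argument, and the difference between an intentional "Stop" and a plain kill when a restart request arrives.

```python
ALIASES = {"stop": "stopmodule", "recon": "reconfigure"}


def parse_command(line):
    """Split console input into (command, argument), expanding the short forms."""
    parts = line.split(maxsplit=1)
    if not parts:
        return ("status", None)  # a bare Enter prints the status table
    command = ALIASES.get(parts[0], parts[0])
    arg = parts[1].strip() if len(parts) > 1 else None
    if arg is not None and arg.isdigit():
        arg = int(arg)  # treat an all-digit argument as a pid
    return (command, arg)


class StatusTable:
    """Hypothetical model of the per-module bookkeeping."""

    def __init__(self):
        self._rows = {}  # pid -> {"name": str, "status": str}

    def add(self, pid, name):
        self._rows[pid] = {"name": name, "status": "Alive"}

    def resolve(self, target):
        """Map a pid or a module name to a pid; a name must match exactly one module."""
        if isinstance(target, int):
            if target not in self._rows:
                raise KeyError(f"no such pid: {target}")
            return target
        matches = [pid for pid, row in self._rows.items() if row["name"] == target]
        if len(matches) != 1:
            raise ValueError(f"{target!r} matches {len(matches)} modules; use the pid")
        return matches[0]

    def stopmodule(self, target):
        # Intentional stop: remembered so that later restart requests are ignored.
        self._rows[self.resolve(target)]["status"] = "Stop"

    def killed_externally(self, pid):
        # A plain kill (pidpau-style): the module is gone but not marked "Stop".
        # The "Dead" label is an assumption, not taken from the status listing.
        self._rows[pid]["status"] = "Dead"

    def should_honour_restart(self, pid):
        """Honour a restart request unless the module was stopped on purpose."""
        return self._rows[pid]["status"] != "Stop"

    def status(self, pid):
        return self._rows[pid]["status"]


if __name__ == "__main__":
    table = StatusTable()
    table.add(101, "module_a")
    table.add(102, "module_b")
    command, arg = parse_command("stop module_a")
    if command == "stopmodule":
        table.stopmodule(arg)
    table.killed_externally(102)
    print(table.status(101), table.should_honour_restart(101))
    print(table.status(102), table.should_honour_restart(102))
```

Keeping the "Stop" mark in the table, rather than simply forgetting the module, is what lets a later restart request be recognised and ignored for the rest of the session.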
Solaris, Linux versions:
- Solaris startstop reads a configuration file named 'startstop_sol.d'
- Linux startstop reads a configuration file named 'startstop_unix.d'
- If a child process does not exit within a user specified time after the
user types "quit", startstop terminates the child process.
- The amount of CPU time used by each child process is listed in the process
status table.
- As of Version 3.0, Startstop can run in background. This modification was
made by Pete Lombard at the University of Washington. Instructions
- To run Earthworm as other than root, you must set the file characteristics.
Instructions
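
The "wait a user-specified time, then terminate the child" behaviour in the list above follows a common pattern, sketched here with Python's standard `subprocess` module. Note the substitution: startstop asks modules to exit through the transport rings, whereas this sketch uses `Popen.terminate()` as a stand-in for that request. The function name `stop_child` and the `grace_s` parameter are hypothetical.

```python
import subprocess
import sys


def stop_child(proc, grace_s):
    """Ask a child to exit; kill it if it is still running after grace_s seconds."""
    proc.terminate()  # stand-in for the ring terminate flag (assumption)
    try:
        proc.wait(timeout=grace_s)
        return "exited"
    except subprocess.TimeoutExpired:
        proc.kill()   # the child ignored the request: force it down
        proc.wait()   # reap the process so it does not linger
        return "killed"


if __name__ == "__main__":
    # A child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_child(child, grace_s=5.0))
```

Whichever branch is taken, the child has been waited on by the time `stop_child` returns, so the caller can safely restart it or shut down.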
Windows, Windows Service version:
- Windows startstop and Windows startstop service read a configuration file
named 'startstop_nt.d'
- If Windows starts up, and, for example, the binary executables for certain
modules are missing or are misnamed, startstop will start up anyway. These
processes will be shown with a nonexistent negative process ID, and "DOA"
as their status. If this process is restarted once the problem that caused
the error has been fixed, the process ID will return to a normal ID, and the
status will change to "Alive".
- Startstop can be set to start automatically when Windows boots up, but it
is probably better to set startstop up as a Windows service. Note that if you
run startstop as a Windows service, you'll need to use other command line
utilities like 'status' and 'restart' to monitor and control Earthworm modules,
since there's no interface to the startstop service. If you're not logged in
as administrator, you can run StartstopConsole to connect to the session
running Earthworm. You'll be able to start and stop Earthworm
with the Windows Services Control Panel.
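
The "DOA" behaviour described in the list above (startstop keeps going when a module's executable is missing, and shows that module with a negative process ID) can be sketched like this. The code is an illustration in platform-neutral Python, not the Windows implementation: `launch_all`, the row layout, and the way the placeholder negative ID is chosen are all assumptions.

```python
import subprocess
import sys


def launch_all(commands):
    """Start each command; record an unlaunchable one as DOA instead of aborting."""
    table = []
    for index, argv in enumerate(commands, start=1):
        try:
            proc = subprocess.Popen(argv)
            table.append({"argv": argv, "pid": proc.pid, "status": "Alive", "proc": proc})
        except OSError:
            # Missing or misnamed executable: no real process exists for this row,
            # so a placeholder negative pid is stored (the numbering is an assumption).
            table.append({"argv": argv, "pid": -index, "status": "DOA", "proc": None})
    return table


if __name__ == "__main__":
    rows = launch_all([
        [sys.executable, "-c", "pass"],               # launches normally
        ["no_such_module_executable", "example.d"],   # hypothetical missing binary
    ])
    for row in rows:
        print(row["pid"], row["status"], row["argv"][0])
        if row["proc"] is not None:
            row["proc"].wait()
```

Each module's parameter file is passed as an ordinary command line argument, and a failure to launch one module does not stop the others from being started.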
Questions? Issues? Subscribe to the Earthworm List (earthw).