mon News


The latest is mon-1.0.0pre4 as of 3-Aug-2004, and is available from the usual location. This should be used only with the mon-client-1.0.0pre2 Perl module.

Also check out Ryan Clark's CSS modifications to mon.cgi, available from here.


The code for mon and Mon::Client has been imported into CVS on sourceforge. See the development page for more information.


The following is the changelog entry for the most recent devel version:
Changes between mon-0.99.2 and mon-0.99.3-46 (devel version)
Fri Jun  4 13:16:23 EDT 2004
-updated lpd.monitor

-added "watch" parameter to monshow 
 submitted by Joe Rhett 

-xedia-ipsec-tunnel.monitor now understands the new
 OIDs for sysObjectID.0 in the newer versions of the software

-fixed exclude_period parsing problem reported by Konstantin 'Kastus' Shchuka 
 and Jeroen Moors 

-fixed a setlogsock problem reported by Gilles Lamiral 
 added AIX to the systems which require setlogsock

-added "clientallow" restriction (trockij renamed it that from serverallow)
 by Ed Ravin 

-added monfailures to clients directory, contributed
 by Ed Ravin 

-patch to fping.monitor which catches more error messages from fping
 by Ed Ravin 

-patch for minor *bsd startup nits
 by Ed Ravin 

-patch to msql-mysql.monitor to support the more typical summary/detail output.
 by Ed Ravin 

-patch to phttp.monitor which corrects the uninitialized variable error
 by Ed Ravin 

-patch to phttp.monitor to show more detail in regexp failures
 by Erik Inge Bolsø

-patch to imap.monitor to report the usual summary followed by details,
 and clarify some error messages for a couple of situations
 by Ed Ravin 

-adjust for some current fping output (ICMP host unreachable),
 correct the docs for failure_interval (which is currently listed as a period
 def rather than a service def)
 from Debian users, submitted by Roderick Schertler 

-another patch to fping.monitor to catch ICMP Time Exceeded failure,
 submitted by John Nelson 

-MON_DESCRIPTION now supplied to monitors

-added "-f" to etc/S99mon

-taint fix for perl 5.8 in monshow from Roderick Schertler 

-added trace.monitor, an alternate route path monitor

-changed ftp.monitor to detect no ftp server when socket opened okay.
 submitted by Dan Kendall 

-updates to ftp.monitor to show detail of ftp conversation

-added irc.alert

-added dns-query.monitor

-mysql.monitor - fix for deprecation of _ListTables
 by Aled Treharne 

-updated smtp.monitor to output detail

-added "version => 2" to monitors which use
 the net-snmp module so that they work with net-snmp 5.0.6

-minor documentation updates

-fixed a bug with the CGI invocation of monshow which would
 yield the error message "premature end of script headers" when you "drilled
 down". bug reported by Hugh Caley 

-mail.alert includes the service description in the body

-fix for alertafter timer, fix for upalertafter feature
 sent by Adrian Chung 

-fix phttp.monitor for RFC compliance, uses \r\n everywhere in its requests.
 Just \n leads to a "400 Bad Request" on IIS 6.0 in native mode.
 sent by Erik Inge Bolsø
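
As an illustration of the CRLF point in the entry above, here is a minimal Python sketch. It is not the phttp.monitor code (which is Perl); the function name build_request and the host name are made up for the example.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.0 GET request with CRLF line endings."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # the two empty strings produce the blank line
        "",  # that terminates the header block
    ]
    # Join with \r\n rather than a bare \n, which some servers reject.
    return "\r\n".join(lines).encode("ascii")


# Hypothetical host name, used only for the example.
request = build_request("www.example.com")
```

Every line of the resulting request, including the terminating blank line, ends in \r\n.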

-fixed reboot.monitor and asyncreboot.monitor to handle counter roll-overs

-_upalertafterinterval typo fix from Michael Rademacher 

-fix to phttp.monitor for EINPROGRESS from Erik Inge Bolsø

-updated file_change.monitor from Jon Meek 

-fixed a dtlog bug where blank lines from the dtlog were being output to the
 client, and the client was interpreting the timestamp as zero. fixed by
 David Nolan

-fixed qpage.alert:
    Only the first pager gets notified when there is more than one listed for a qpage.alert.
    The problem is that qpage returns 0 for failure and 1 for success which is
    backwards from what the alert routine thinks will happen.
    submitted by 
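
The inverted status convention described above can be sketched in Python as follows. This is only an illustration under assumptions, not the actual qpage.alert fix: `send` is a hypothetical callable standing in for the qpage call, and the 1-means-success convention is taken from the entry above.

```python
def page_sent(qpage_status: int) -> bool:
    """Interpret the status as the entry describes it:
    1 means the page was sent, 0 means it failed."""
    return qpage_status == 1


def notify_all(pagers, send):
    """Call send(pager) for every pager and return the pagers that failed.

    `send` is a hypothetical stand-in for invoking qpage for one pager.
    The loop never stops early, so every listed pager is tried.
    """
    failed = []
    for pager in pagers:
        if not page_sent(send(pager)):
            failed.append(pager)
    return failed
```

Reading a status of 1 as an error would make the first successful page look like a failure, which matches the symptom of only the first pager being notified.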

-updated nntp.monitor to support authentication
 submitted by Kai Schaetzl

-unbuffered monerrfile, maybe it'll work

-fixed trap auth problem, parsing bug
 submitted by

-updated mon.8 to explain how to set environment variables
 for each service to be passed to monitors and alerts. also
 removed the wording that the client handling is iterative
 (it is not).

-updated netappfree.monitor, submitted
 by Ed Ravin 

-patch to fix broken upalerts
 submitted by Daniel Fenert 

-patch to dns.monitor for added functionality
 Added -serial_threshold command line argument to allow the zone serials
 between the master and the slaves to differ by that much, at most.  Necessary to
 avoid spurious errors during zone propagation.  High thresholds are
 typically unnecessary, but when using Dynamic DNS, with zones that update
 hundreds if not thousands of times an hour, they can be off by quite a bit
 but still be OK.  If propagation completely fails, eventually we'll exceed
 the threshold.
 Added a mode for monitoring caching only name servers.  Give the
 -caching_only argument, and then instead of -zone and -master arguments,
 you specify -query arguments, which are of the form record[:type].  (With
 A records being the default type)  So you might specify '-query -query -query'
 Every server will be queried for each request, and must return a valid
 response.  But the records will NOT be cross checked against each other,
 as various round-robin DNS situations may cause the different servers to
 have different data.
 Fixed some error reporting code so that the output is better
 Changed the script exit value to be the highest count of how many servers
 failed on a single query.  (I.e. if three servers are queried, for 20
 records, the highest error code possible is 3, not 20 as it was before)
 I found all of these changes to be necessary in our environment, and none
 of them greatly change the original behavior, so I figured they were worth
 submitting.  I would just submit a diff, but a context diff was actually
 BIGGER than just sending the whole file...

 submitted by David Nolan 
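
The serial threshold and the new exit value described in the dns.monitor entry above could look roughly like this in Python. This is a sketch, not the dns.monitor code; the function names and the data layout (a mapping from query to the list of servers that failed it) are assumptions, and serial number wrap-around is ignored.

```python
def serial_ok(master_serial: int, slave_serial: int, threshold: int) -> bool:
    """True when the slave's serial is within `threshold` of the master's."""
    return abs(master_serial - slave_serial) <= threshold


def exit_value(failures_per_query) -> int:
    """Highest number of servers that failed on any single query.

    `failures_per_query` maps each query to the list of servers that
    failed it, so three servers and 20 records give at most 3.
    """
    return max((len(servers) for servers in failures_per_query.values()),
               default=0)
```

With three servers queried for 20 records, exit_value can never exceed 3, no matter how many of the 20 queries had a failure.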


There are now two separate releases of mon being maintained, the "stable" version (very conservatively released), and the "development" version (very liberally released). The "development" version will almost certainly be newer, buggier in some ways, less buggy in other ways, and maybe more featureful than the stable release. For an explanation of the naming conventions and their respective locations, see the devel page.


Added Nate Campi's demo mon.cgi interface to the "about" documentation
on the web page.


Thu Feb  7 09:17:25 EST 2002 trockij

updated some stuff on the web page.

Changes between mon-0.99.1 and mon-0.99.2
Sat Sep  8 10:06:01 EDT 2001

-fping.monitor reports the error when it gets a return value
 from fping which it doesn't recognize. this could have been
 the cause of some phantom alerts reported w/empty summary

-fixed comments in CHANGELOG

-andrew ryan patch to fix checkauth and some monerrfile fixes, theo's
 fix for alertevery. this fixes the "cannot connect to mon server"
 problem with mon.cgi.

-andrew ryan patch to open/close dtlog for each entry, renamed open_dtlog to init_dtlog



Changes between mon-0.38.21 and mon-0.99.1
Sun Aug 19 15:18:55 EDT 2001


the following two defaults were changed, since they seem to be unintuitive
to most people, based on feedback given on the mailing list.

   -the old "comp_alerts" is now the default. to get the old
    behavior, specify "no_comp_alerts" in the period section.

   -the default is now the old "summary" behavior for alertafter. that
    means that for successive failures with "alertafter" used to suppress
    multiple alerts, only the summary line will be used to short-circuit
    the alert suppression.  to get the old behavior, append "no_summary"
    to the alertafter line. the old "summary" syntax is still permitted
    to help w/backwards compatibility.


-cleaned up config parsing a bit

-updates to up_rtt.monitor, added traceroute.monitor, smtp3.monitor,
 http_tpp.monitor, file_change.monitor

-fixed problem where upalerts were not sent for ack'd failures

-updated the sample etc/ to give examples of trap authentication

-updated man page for mon to include better explanation of syntax.

-formatting updates to monshow, added "summary-len" option, html

-fixed problem where server responded twice with an extra "220 ok"
 after doing a reload

-rewrote fping.monitor to return more verbose output, and to sort
 the failed hosts on the summary line. this was wreaking havoc
 with "alertevery", since the order of the failed hosts in the
 summary might change, even though the same hosts were failing on
 successive tests. added "-s ms" option which will consider hosts
 with a response time greater than "ms" milliseconds as failures.
 added  "-a" option to fail only when all hosts fail, and "-T" to call
 traceroute on each failed host. "-h" lists options.
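
Why sorting matters for "alertevery" can be shown with a short Python sketch. It is an illustration only, not the fping.monitor code; the function names are made up, and the response times are assumed to be available as a host-to-milliseconds mapping.

```python
def summary_line(failed_hosts) -> str:
    """Return a space-separated summary with hosts in a stable, sorted order."""
    return " ".join(sorted(set(failed_hosts)))


def slow_hosts(rtt_ms, threshold_ms):
    """Hosts whose response time exceeds threshold_ms (the "-s ms" idea)."""
    return sorted(host for host, rtt in rtt_ms.items() if rtt > threshold_ms)
```

Because the summary is built from a sorted list, the same set of failed hosts always yields the same summary string, whatever order the failures were reported in.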

-made nearly all monitors output their summary line (if it is a list
 of hosts) in sorted order.

-updated man page for mon with more detail on the behavior of "alertevery"
 and "alertevery ... summary"

-added xedia-ipsec-tunnel.monitor to monitor site-to-site ipsec tunnels
 on a Xedia AP450 router.

-silkworm.monitor recognizes different brocade OEM'd fcal switches,
 ignores "absent" sensors, and has a work-around for the braindead
 behavior of swFCPortAdmStatus to detect offline ports.

-fix to msql-mysql.monitor to allow --port to override default port.
 submitted by Adrian Phillips 

-stdout and stderr now can be sent to a file by specifying a filename
 in the variable "monerrfile".
 submitted by Ed Ravin 

-updated dns.monitor to output only the failed hosts on the
 summary line.

 "test config" fix,

 new authentication directives  "!" and "AUTH_ANY". "AUTH_ANY",

 check and warnings for hostgroups which are defined but never used

 more descriptive error when m4 is not found

 removed second definition of disen_host and load_stat

 "alertafter timeval" patch, alerts for period will only be called if
  the service has been in a failure state for more than the length
  of time described by the interval, regardless of the number of failures
  noticed within that interval.

 submitted by Andrew Ryan 

-more verbose error on bind(2) failure

 tyop fixes to mon.1

 updated COPYRIGHT

 mon.1 is now mon.8, and references to mon.1 changed accordingly

 update to mon.d/Makefile to use $CFLAGS and $LDFLAGS

 silence some warnings in rpc.monitor.c

 add /usr/local/lib to standard search paths for alert.d and mon.d,
   and updated mon.8

 make monshow run under taint mode, fixes view directory to match the docs

 default server for moncmd and monshow is now localhost

 http.monitor accepts a 302 status (moved temporarily)

 fixed --auth in monshow

 reboot.monitor now uses $MON_STATEDIR as the default state directory,
   and "reboot.monitor" (not "state") as default state file.


 update to monshow.1

 submitted by Roderick Schertler 

-fix to pop3.monitor to produce more verbose errors

 fix to reboot.monitor to add --verbose option

 submitted by Ed Ravin 

-qpage.alert accepts "-v" option for verbose

 smtp.monitor has increased verbosity of failure details

 submitted by Steve Siirila 

-re-wrote Steve Siirila's mon.monitor to use Mon::Client and put it
 in mon.d

-patch to do proper syslog handling on openbsd,

 MON_DEPEND_STATUS env variable passed to monitors

 submitted by Mark D. Nagel 

-added "failure_interval" functionality. i actually re-wrote
 the patch to make it a bit more proper, and renamed the
 parameter from "alertintervalcheck" to "failure_interval"
 for clarity.

 submitted by CHASSERIAU JeanLuc 

-netappfree.monitor changes
 Allows the monitor to give more verbose error messages which
 will handle multiple volumes. Instead of reporting:
 "1.0GB free on "
 it will now say:
 "1.0GB free on :/vol/"
 Fixes a bug where multiple alerts from a single filer would cause
 multiple entries in the summary line.  Allows the monitor to handle
 the case where the NetApp MIB isn't available to the script.

 added na_quota.monitor. trockij made some small changes to it
 so that it will allow disable and enable to work.

 submitted by Theo Van Dinter 


Differences between Mon-0.10 and 0.11:
Sun Oct 29 12:16:43 PST 2000

 -incorporated Andrew Ryan's "test config" patch

Changes between mon-0.38.20 and mon-0.38.21
Sun Jan 14 11:55:06 PST 2001

-merged in Andrew Ryan's mon.testconfig.patch to enhance error
 detection and reporting of config file errors. a new client
 command "test config" loads and parses a new config file w/o
 committing it, and returns error conditions found.

-added foundry-chassis.monitor, detects PSU failures on Foundry
 chassis devices.

-update for up_rtt.monitor and added http_tp.monitor from Jon Meek.

-fixed OS detection, patch supplied by Roderick Schertler 

-tiny patch to freespace.monitor which lets the user specify a min % free,
 submitted by Christian Lademann 

-http.monitor now accepts 401 responses as success, a tweak from
 Tim Small 

-documentation correction from Chris Snell 

-added cpqhealth.monitor, which detects PSU/fan/temp problems by
 querying the Compaq Insight manager agent on Presario systems

-save sum and dtl into last_summary and last_detail from traps, bug
 reported by Jan Krivonoska 

-patch to correct trap decoding problem, submitted by
 Ramon Buckland 

-a trap timeout now clears the value of last_detail

-dtlog is now written to upon reception of an "ok" trap

-patch from Gilles Lamiral  which adds
 accuracy to scheduler's synchronous operation. this should help
 keep rrdmon happy.

-added silkworm.monitor to test the operational status of Brocade
 Silkworm FCAL switches. it should detect port, fan, psu, and temperature

-fix to http.monitor from Andrew Ryan  which
 prints the HTTP response header even if a timeout was encountered.
 also fixed another bug w/regards to timeout handling. i applied
 this fix to the following monitors:

-http.monitor will allow you to supply a user agent string of your
 own via "-a useragent". also "-o" will omit HTTP headers from
 properly working hosts (Andrew Ryan )


Changes between mon-0.38.19 and mon-0.38.20
Sat Aug 26 13:29:45 PDT 2000

-updated some docs

-http.monitor checks for 401 status code

-fixed the buggered 0.38.19 release. damn you, cvs, damn you.


Changes between mon-0.38.18 and mon-0.38.19
Sun Aug 20 14:28:23 PDT 2000

-fixed exclude_hosts (again) and tested and tested and tested
 and it works

-patch from andrew ryan to add checkauth command

-included phttp.monitor from Gilles Lamiral 

-changed some wording in INSTALL

-first stage of new config buffering

-readhistoricfile now clears out last_alerts before reading it in

-added -t TRAPPORT cmdline arg

-merged patch from Andrew Ryan to support multiple authtypes,
 including PAM support. Also fixed a bug when the user is listed in but not in the userfile.

-updated documentation of mon.1 to include PAM authentication.

-removed non-portable sockaddr pack statements from monitors.

-CVS has pissed me off to no end with its anomalies, so I did
 a sensible thing and converted the repository to prcs. prcs seems to
 be simple, easy to understand, not quirky, and good enough. So, if
 you notice that the ID version numbers in the sources have changed,
 this is why.

-removed mon.cgi, and replaced it with a README


Release mon-0.38.18
Changes between mon-0.38.17 and mon-0.38.18
Sat Mar  4 11:24:34 PST 2000

-http.monitor accepts 200 and 302

-monshow changes, mostly detail output

-"list opstatus" command shows more data:

-fixed exclude_hosts


Release mon-0.38.17 and Mon-0.8.
Changes between mon-0.38.16 and mon-0.38.17
Sun Feb 27 20:18:46 PST 2000

-added "SELF:" for "depend" variable. When the config file
 is parsed, SELF: expands into "currentwatch:".
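
A sketch of the SELF: expansion, in Python for illustration (the server itself is Perl). The function name is hypothetical, and the dependency strings in the usage note are made-up examples rather than syntax taken from the mon documentation.

```python
def expand_self(depend: str, current_watch: str) -> str:
    """Replace every "SELF:" in a depend string with "<current_watch>:"."""
    return depend.replace("SELF:", f"{current_watch}:")
```

For a watch named "routers", expand_self("SELF:ping", "routers") gives "routers:ping".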

-fixed some errors in mon.1

-added exclude_hosts

-added exclude_period

-removed duplicate parsing in read_cf

-"list opstatus" will now accept a list of "group,service" pairs if you
 don't want to list every single group and service.

-documented MON_LOGDIR and MON_STATEDIR in mon.1

-changed how args are split in client_command

-more enhancements to monshow, esp. config options and "view" support.
 read the man page for the details. "views" are meant to show
 a subset of the mon opstatus, and be configurable by the clients.
 for example, each department can get their own view of the systems
 and services which they care to monitor instead of seeing the
 entire list of services monitored by the server.

-added protid client command, and store PROT_VERSION as an integer
 for simple comparison.


Released mon-0.38.16 and Mon-0.7 client Perl module.
Changes between mon-0.38.15 and mon-0.38.16
Sun Feb  6 16:45:55 PST 2000

-monshow now properly displays the "last check" column in seconds,
 and it also displays the description, and you can click on services
 to get details. acknowledged failures are indicated.

-rewrote cf-to-hosts to support continuation lines

-fixed some documentation

-upalerts work with traps now, thanks to Jim Farrell 

-savestate now produces an error if called w/o arguments

-a patch set submitted by Andreas J. Koenig 
 that helps with some of the documentation

-silly "list pids" output fixed so that the output doesn't have
 lines beginning with numbers, which confuses Mon::Client. submitted
 by David Waitzman 

-fixed problem with acking non-failed services

-config var that allows specification of syslog facility to use

-detail about how "use snmp" is parsed. it's now a variable in the config
 file, and it still doesn't really do anything.

-historicfile is re-read upon server reset.

-catching a HUP in the I/O event loop should no longer produce
 the "error trying to recv a trap" message in syslog.

-new config option "startupalerts_on_reset"

-new client command "list dtlog" submitted by Martha H Greenberg 


Released mon-0.38.15 and accompanying Mon-0.6 client Perl module.
Changes between mon-0.38.14 and mon-0.38.15
Sun Nov 14 11:20:23 PST 1999

-Re-wrote dependency code, and fixed the "no upalerts with dependencies" problem

-list opstatus output now includes a new variable called "depstatus"

-Documented the "alertafter" behavior if only 1 argument
 is supplied.

-Fixed a bug in the arg processing of tcp.monitor, submitted by
 Phillip Pollard .

-Disabling hosts which do not exist now produces an error

-Giving an invalid disable command now produces an error

-Added "list deps" command.

-If config file ends in .m4, process it with m4.

-monshow now shows --deps

-trap.alert uses opstatus found in MON_OPSTATUS or -o, and
 correctly reports it using "spc=".

-fixed problem where ack'ing a non-existent service is not
 complained about, reported by

-"use strict"-ified the server.

-monshow now does CGI && command-line, opstatus.cgi is deprecated
 see etc/example.monshowrc

-ldap.monitor now uses Net::LDAP

-summary output of successes is saved in _last_summary

-client output is hex-escaped, and received traps are un-escaped.
 install the Mon-0.6 perl client for this to work properly, since it
 includes the appropriate changes.

-renamed "reset" function. This was a BIG booboo and it was causing
 a core dump once in a while. "reset" is a perl built-in, which I
 didn't realize :(

-tags in traps are unquoted and unescaped in handle_trap. Mon::Client
 was changed to quote and escape all of them.

-added "numalerts" per-period variable and documented it. it controls
 the number of alerts sent for a failure

-added "comp_alerts" per-period variable and documented it. this var
 stops upalerts from being sent w/o a complementary "down" alert

-it is now possible to specify the binding address for server and trap
 ports. see the man page for details.

-fixed some signal handling and terminal input in moncmd. patch provided

-patch from to correct error reporting in

-long lines may be continued by trailing them with a backslash. read the man
 page for more info.
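
Backslash continuation can be handled along these lines. This Python sketch only illustrates the idea; how mon itself treats whitespace around the backslash is not specified here, so the man page remains the reference.

```python
def join_continuations(lines):
    """Merge each line ending in a backslash with the line that follows it."""
    result = []
    pending = ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            # Drop the trailing backslash and keep accumulating.
            pending += line[:-1]
        else:
            result.append(pending + line)
            pending = ""
    if pending:
        # A backslash on the very last line has nothing to join with.
        result.append(pending)
    return result
```

Given the three lines "a b \", "c d" and "e", the function returns two logical lines, "a b c d" and "e".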

-added "alerts_sent" to opstatus output

23-Aug-1999 21:30

Released 0.38.14. This is probably considered the most stable release. There are some known bugs, but at least they are known and documented :)

Also, I've updated some code in the "contrib" directory.

Changes between mon-0.38.13 and mon-0.38.14
Mon Aug 23 10:48:42 PDT 1999

-Some clarification in INSTALL procedure.

-Removed old patch that attempted to fix the "no upalerts with deps" problem

-Added recursion limit for deps, and the "dep_recur_limit"
 config parameter in the config file.

-Some changes to "monitor .* ;;" parsing behavior.

-telnet.monitor now uses Net::Telnet, which is more efficient than
 forking a copy of tcp_scan.

-freespace.monitor uses the newly renamed Filesys::DiskSpace, which
 used to be File::Df.

-added asyncreboot.monitor, which uses the UCD SNMP asynchronous API
 to get the uptime of a bunch of devices in parallel, similar to
 fping. This requires ucd-snmp-3.6.3 or greater and SNMP-1.8 or greater.

-Ditch stderr in fping.monitor, submitted by

-ftp.monitor now sends "quit\r\n"

-Dependency bug fixed re: $dlastChecked, reported by

-Commented out some spurious output in dns.monitor, as submitted by

-Tiny fix to mon.cgi from Matthew Price 

-Fix to trap.alert to make it actually work w/o complaining about
 "undefined type".

-Fix to opstatus.cgi for refresh, submitted by, bug
 ID 16.

-Patch from Petter Reinholdtsen  to add debug output
 to nntp.monitor, and -g to specify the newsgroup to test.

-Re-wrote tcp.monitor to not require tcp_scan. No more dependencies on
 "Satan" software, since fping is available separately.

-Virtual host support in http.monitor, submitted by
 Neale Pickett 


mon-0.38.14 is nearly ready, and it's full of tiny bug fixes and a couple of new monitors. I still need to investigate some problems with dependencies that people have reported.

Updated the main page to point to a mirror.


Released mon-0.38.13 and accompanying Perl module Mon-0.5.tar.gz.
Changes between mon-0.38.12 and mon-0.38.13
Sun Jun 13 11:18:16 PDT 1999

-Monitors and alerts are now passed ENV variables MON_STATEDIR and MON_LOGDIR.

-Fixes and tuning to opstatus.cgi.

-monstatus has been removed. Replacement is monshow.

-util/cf-to-hosts accepts -M flag to pre-process with m4.

-Fixed some monshow output when service has not yet been tested.

-Some adjustments to the monshow man page.

-Forked monitors now close server sockets before execing the monitor.
 Bug ID 16 submitted by

-Bug re: "time" file in output of monshow.

-Some minor code cleanups.

-ping.monitor now recognizes netbsd.

-mon.cgi uses Mon::Client, but not all the functionality has been
 converted to this interface, namely the "disable" and "reset" features.


CareTracker is back online and ready to accept bug reports and feature suggestions for mon. See the bug tracking system for more details.


Released 0.38.12.
Changes between mon-0.38.11 and mon-0.38.12

-Fixed "list descriptions" bug submitted by Vad Adamluk 

-Added "last_check" and "monitor" output to client list
 opstatus. opstatus.cgi uses this.  Only works for 0.38.* protocol.

-opstatus.cgi now uses Mon::Client, and some bug fixes and enhancements.

-Removed "bind" from ftp.monitor http.monitor http_t.monitor imap.monitor
 nntp.monitor pop3.monitor smtp.monitor. It was unnecessary.


Released 0.38.11, which fixes a bug which would cause some alerts to not be invoked.


I just put 0.38.10 into the "BETA" directory on Thanks for the feedback, and keep testing :)
$Id: news.m4,v 1.19 2004/10/05 13:26:15 trockij Exp $

Changes between mon-0.38.9 and mon-0.38.10

-Fixed a bug where call_alert didn't set _last_alert correctly,
 thus causing things like alertevery to not work properly.

-Small bug fix in handle_trap_timeout

-Removed some debugging junk for dtlogging

-A few code cleanups here and there

-Fixed @groupargs problem in call_alert


Released mon-0.38.9 and Mon-0.4. This is a feature freeze, and only bugfixes will be included into the 0.38.* branch. New features will be added into the new 0.39.* branch.

Changes between mon-0.38.8 and mon-0.38.9

-Removed %var% substitution in favor of -M, which pre-processes the config
 file with m4. Macro expansion should be handled by software whose sole
 purpose is to perform macro expansion, hence m4.

-Added an "example.m4" in the etc/ directory.

-Added "fail" trap.

-Pass _op_status value to alerts via env variable MON_OPSTATUS.

-Updated file.alert to log MON_OPSTATUS.

-Fixed bug in client buffer handling where a blank line submitted by
 the client would prevent all future commands from being processed.

-The server no longer disconnects the client on an invalid command.

-Added "--disabled" and "--state" commands to monshow. Showing disabled
 hosts is no longer the default. The defaults can be set in ~/.monshowrc.
 This requires the latest Perl module (Mon-0.4). Also added "--old" option.

-Added man page for monshow.

-Updated some docs in mon.1

-Don't complain if userfile does not exist and the authtype is not userfile.

-Patched in Gilles' historicfile stuff, and documented it in mon.1, and fixed
 some bugs.

-Alerts are no longer called with -l parameter. It's never been documented,
 and no alerts use it, so I'm ditching it.

-version command returns a value like "0.38.9" rather than a float.
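
One reason a dotted string is easier to work with than a float: a three-part version such as 0.38.10 has no float form, while a string can be split and compared component by component. The Python helper below is a hypothetical client-side sketch, not part of mon or Mon::Client.

```python
def parse_version(text: str):
    """Turn "0.38.9" into (0, 38, 9) so versions compare component-wise."""
    return tuple(int(part) for part in text.split("."))
```

Compared this way, "0.38.9" sorts before "0.38.10", which a plain string comparison would get wrong.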

-Separated alert calling function from the function which determines
 if an alert should be called.

-Alerts are now forked with a separate environment than the parent.

-"test alert|upalert|startupalert" client command added, which will immediately
 call an alert for testing purposes. Updated the docs for moncmd to reflect
 this command.


Released mon-0.38.8 and Mon-0.3.
Changes between mon-0.38pre7 and mon-0.38.8

-mon is now kept under CVS control (exclusively to maintain my own
 personal sanity). The Perl module is distributed as a separate file now,
 so that it can find its home in the CPAN module directory.

-Documented "traptimeout" and "trapduration", and cleaned up
 some docs in mon.1.

-Included upalerts and startupalerts in gen_scriptdir_hash()

-Lots of code cleanups in read_cf.

-alertafter now has two forms, one just like before, and
 one with a single integer argument which alerts after some number of
 consecutive failures.
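
The single-integer form of alertafter amounts to counting failures in a row. The Python class below is a minimal sketch of that idea under stated assumptions; the class and method names are made up and do not correspond to anything in the mon source.

```python
class ConsecutiveFailures:
    """Signal an alert once `limit` failures have occurred in a row."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def record(self, failed: bool) -> bool:
        """Record one test result; return True when an alert is due."""
        # Any success resets the run of failures to zero.
        self.count = self.count + 1 if failed else 0
        return self.count >= self.limit
```

With a limit of 3, two failures followed by a success do not alert, and the count starts again from zero.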

-I should have done this long ago. %watch now looks like this:
 instead of
 and $service is the text of the service, not an integer.

-Lots of code cleanups regarding global variables which are
 altered by read_cf.

-Fixed "list successes" and "list failures" command.

-Added "clear timers" command which clears the timers
 for things like alertafter and alertevery and such.

-netappfree.monitor has some MIB reading changes which fixes the
 core dumping problem.

-Added set_op_status.

-Removed some debug cruft from check_depend.

-Fix to $fhandles{"$group/$service"}.

-Updated "-h" output to be accurate.

-Test -f to see if an alert or monitor exists before trying to
 exec it.

-gilles reported a problem with the servertime output, which was fixed.

-"interval" initialization was supplying a default interval,
 which isn't cool because it didn't allow you to have a service w/o an
 interval for use as a trap sink. The new default is undef.

-I started work on muxpect, which is sort of a combination of the mux
 capabilities of fping and doing Expect-style chat sequences over TCP
 sockets. It is meant to replace those millions of TCP-based monitors
 in the mon.d directory with a less CPU-intensive version.

-Some alert decision code moved from proc_cleanup to do_alert where
 it belongs.

-Changed some trap code.


Released 0.38pre7.

Changes between mon-0.38pre6 and mon-0.38pre7

-Added "basedir=" and -b, and "cfbasedir=" and -B

-use usleep.

-Added startupalerts which are called upon startup.

-alerts called with env variable MON_ALERTTYPE

-logdir, added downtime logging via dtlogging/dtlogfile

-Periods can now be specified using a LABEL: tag (similar to
 labeling blocks and loops in Perl). This allows multiple periods with
 the same period value. This feature is useful because the "alertafter"
 and "alertevery" counters are kept on a per-period basis.

-Fixed process.monitor to use the new values for the process table
 in the UCD MIB.

-Fixed a problem with reload and path/file expansion.

-Alerts are now called with MON_RETVAL set to the exit value of the

-Added trap.alert. Not quite documented.

-Added version command to Mon::Client, thanks to


Release mon-0.38pre6. Please test. Includes dialin.monitor and some bug fixes. See CHANGES for details.


Added a couple of more things to the FAQ.


Added a couple of things to the FAQ.


Released mon-0.38pre5.

Changes between mon-0.38pre4 and mon-0.38pre5

-Fixed bug #3, problem with %alias

-Fixed bug #4, problem with unpacking a socket which wasn't
 really a socket yet (out of order assignments)

-Renamed Client to Mon-0.01 to follow the Perl module naming
 convention better, and to make room for things like logging
 modules and such.

-Implemented more protocol commands to Mon::Client. Only 4 left...

-Adjusted nntp.monitor to allow for some protocol / implementation
 inconsistencies. The commands now strictly follow RFC977.

-Fixed problem with 0.38 protocol and Mon::Client.

-Added multiple authentication types, including getpwnam, shadow,
 and userfile. Read the man page for details.

-Added "version" client command to identify the protocol version.

-Added host + user authentication to traps. Configuration is done
 in No documentation yet.

-Added simple downtime logging, and documented it in mon.1.

-Tiny change to reboot.monitor.

-Added Mon::SNMP module to decode SNMP traps.

-Added pod to Mon::Client. I think it took as long to
 code it as it did to document it.


Some web page updates. 0.38pre5 is coming along, and should be released this weekend. I think I've nailed the "upalerts with dependencies" problem.


New bug tracking for mon, from the guys at LinuxCare. I figured I'd take advantage of someone else doing work for us :)


Released mon-0.38pre4.

Changes between mon-0.38pre3 and mon-0.38pre4

-Added fixes from Chris Adams  that correct some $ALERTDIR
 and monitor argument problems.

-Fixes to monstatus from brian moore.

-Another fix to get the "exit=n" stuff working with alerts again, broken
 because of ALERTHASH code.

-Wrote "monshow" in the clients directory, which is a per-user configurable
 command-line client.

-Mon::Client perl module included to help simplify writing clients. It
 doesn't implement a number of commands yet. Look at the end of
 to see which commands have been implemented and which have not been.

 "monshow" is in the clients directory, and it is an example of how to
 use the Mon::Client module.

 Mon::Client also needs POD documentation.


Back from the USENIX LISA conference in Boston, and I did a little work on mon while on the plane and in the room, so here is 0.38pre3.
Changes between mon-0.38pre2 and mon-0.38pre3

-Added "ack" client command, which will acknowledge a service failure and
 suppress all further alerts for that service while it continues to fail.
 See the moncmd man page for details. You can "ack" with a string of

-alertdir and scriptdir can now contain multiple colon-separated
 paths. This feature is useful for keeping site-specific monitors
 and alerts in their own directory which is separate from the monitors
 which are distributed with mon itself. Updated the docs for this.

 A hash is generated after each time the configuration is read
 which holds the location of where each monitor and alert script can
 be found. Errors are reported via syslog, so pay attention to them.
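
The lookup table described above might be built roughly as follows. This is a Python sketch, not the server's Perl code; the function name is hypothetical, and letting the first directory in the list win when a script appears in several is an assumption, not documented behavior.

```python
import os


def build_script_map(path_spec: str):
    """Map each script name to the first directory in path_spec that holds it."""
    found = {}
    for directory in path_spec.split(":"):
        if not os.path.isdir(directory):
            # The real server reports errors via syslog; this sketch skips.
            continue
        for name in sorted(os.listdir(directory)):
            full = os.path.join(directory, name)
            # Assumption: an earlier directory takes precedence.
            if os.path.isfile(full) and name not in found:
                found[name] = full
    return found
```

Building the table once per configuration read means each monitor or alert can later be located with a single dictionary lookup instead of a fresh directory search.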

-Some "alias" code tweaks. Gilles, does it work??? If no, send
 the patch.

-Poked a little with the trap code. The trap  now contains
 a "spc" tag which specifies the specific type of trap, like maybe
 SNMPv1 or SNMPv2 or "mon 0.38". 

-An update to rpc.monitor to let it build under Solaris. It can now
 also check to see if an arbitrary RPC program number is registered.
 Documentation updates.

-Dependencies are still broken, because I haven't spent any time at
 all looking at them.


Release of *ALPHA* 0.38pre2 (notice the numbering change).

Changes between mon-0.38pre1 and mon-0.38pre2

-Some fixes from brian moore to correct client hangups

-netappfree.monitor changes, including --list option to
 list the filesystems on the filers for help in building a config file.

-Trap handling changes, including packet . More provisions
 for direct SNMP handling. I might add direct provisions for mon to take
 SNMP traps directly. UCD SNMP trap handling callback mechanism doesn't
 fit into mon very well.

-"list opstatus" output is now different

-Time::HiRes is now required. The trick is that handle_io() wants
 to spend $SLEEPINT handling I/O from clients. Some OSs allow select(2)
 to return the time remaining, which we want because if select returned
 in say, 0.2 seconds then we want to call select with a timeout of 0.8
 seconds so that we get the full second of waiting for I/O. Some OSs
 do *not* return the time remaining from a select call, and time(2)
 doesn't return sub-second resolution, so we need gettimeofday(2) to
 figure out how long select spent waiting. I guess the whole point here
 is to try to handle traps as soon as they come in.

-Fixed @last_failures discrepancy with traps.

-Added Gilles' alias record stuff to config file

-Included Jon Meek's up_rtt.monitor which checks the availability of
 hosts and logs some statistics, like min/mean/max round trip times.
 Requires Time::HiRes and Statistics::Descriptive.
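
The timing problem behind the Time::HiRes requirement in the 0.38pre2 changes above can be sketched like this. It is an illustration in Python rather than mon's Perl: the handle_io signature and the on_readable callback are assumptions made for the example, and time.monotonic() plays the role that gettimeofday(2) plays in the description. The idea is to measure how long each select call actually waited and to call select again with whatever is left of the interval.

```python
import select
import time


def handle_io(sockets, sleep_interval, on_readable):
    """Spend about sleep_interval seconds servicing readable sockets.

    select() is not trusted to report the time remaining, so the
    remaining timeout is recomputed from a sub-second clock each pass.
    """
    deadline = time.monotonic() + sleep_interval
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # If select returns early (say after 0.2s of a 1s interval),
        # the next pass waits only for the remaining 0.8s.
        readable, _, _ = select.select(sockets, [], [], remaining)
        for sock in readable:
            # The callback must consume the pending data, otherwise the
            # socket stays readable and this loop spins until the deadline.
            on_readable(sock)
```

Input is handled as soon as it arrives, and the function still returns only after the full interval has passed.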


Release of *ALPHA* 0.pre38a. This release is primarily meant for people who can help fix bugs, so I wouldn't expect this release to work well for everyone.


Added link to Tim Potter's archive of the mailing list. Much better than the half-assed archive that I threw together :)

mon-0.38 should be released within a week or so. It includes Jing Tan's dependency code, asynchronous event handling (mon is now distributed), a few new client commands, some bug fixes, some new monitors, and new disabling behavior (you can disable a group/service/host for a particular duration, and mon will re-enable it when the time comes).


Updated the FAQ to point to CPAN and some other places, and added a question and answer submitted by Rikhardur Egilsson.


Released 0.37l. Lots of changes.

$Id: news.m4,v 1.19 2004/10/05 13:26:15 trockij Exp $

Changes between mon-0.37k and mon-0.37l
-Config parser change from Michael Griffith that
 complains when "alertafter" will never trigger an alert.

-Added "savestate" and "loadstate". Currently these only save
 and load the state of things disabled.

-The server now can authenticate clients using a simple
 configuration file which can restrict certain users to
 using only some (or all) commands. "moncmd" was updated
 to support this feature.

-Addition of "upalerts" which may be called when a service
 changes state from failure to success. "upalerts" can be
 controlled by the "upalertafter" parameter.

-"alertevery" now ignores detailed output when it decides
 whether or not to send an alert. Patch submitted by
 brian moore.

-"hostgroup and hyphen" patch. This simple patch will allow
 hyphens and periods in hostgroup tags.

-Multiline output fixes in smtp.monitor 

-Now monitors are not called when no host arguments are supplied. This
 can be overridden with the per-service "allow_empty_group" option.

-A fix to ftp.monitor by Tiago Severina which allows
 for multiple 220 lines in the greeting from the FTP server.

-Added snpp.alert, contributed by Mike Dorman.
 This requires the SNPP Perl module.

-Added ldap.monitor, contributed by David Eckelkamp.
 This requires the Net::LDAPapi module.

-Added dns.monitor, contributed by David Eckelkamp.
 This requires the Net::DNS module.

-Monitor definitions can now include shell-like quoted words, as defined by
 the Text::ParseWords module (included with perl5). e.g.:
        monitor something.monitor -f "this is an argument" -a arg

-"allow_empty_group" is a new per-service option. If set, monitors will
 still be run even if all hosts in the applicable hostgroup have been
 disabled. The default is that allow_empty_group is not set.

-Monitors are now forked with stdin connected to /dev/null.

-Added "stop" and "start" commands which let you make the server cease from
 scheduling any monitors. While stopped, clients can still be handled. The
 server may be started[sic] in "stopped" mode with -S. There is now a
 "reset stopped", which is an atomic version of "reset" and "stop". This
 is useful if you want to re-disable things immediately after a reset,
 so there will be no race conditions after the reset and before you
 disable things.

 opstatus.cgi now also reports the state of the scheduler.

-Updated documentation for monitors, the main "mon" manual,
 and the "moncmd" manual.

-Fixed a few problems in handle_client that had to do with shutting
 the server down.
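
To illustrate the shell-like quoted words now accepted in monitor definitions (see the Text::ParseWords entry above), here is a small sketch in Python. It uses Python's shlex module as a stand-in for Text::ParseWords, so the quoting rules are similar but not guaranteed to be identical, and the function name parse_monitor_line is hypothetical.

```python
import shlex


def parse_monitor_line(line):
    """Split a 'monitor ...' definition into (script name, argument list).

    Double-quoted text is kept together as a single argument.
    """
    words = shlex.split(line)
    if len(words) < 2 or words[0] != "monitor":
        raise ValueError(f"not a monitor definition: {line!r}")
    return words[1], words[2:]


script, args = parse_monitor_line(
    'monitor something.monitor -f "this is an argument" -a arg'
)
print(script)
print(args)
```

The quoted phrase reaches the monitor as one argument instead of being split into four separate words.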