.TH moncheck 1
.SH NAME
moncheck \- runs shell commands as checks
.SH SYNOPSIS
.B moncheck
[\fB\-\-config\fR \fIPATH\fR]
.SH DESCRIPTION
.B moncheck
runs commands as checks and writes the results back to the database.
.SH OPTIONS
.TP
.BR \-\-config =\fIPATH\fR
Use the specified config file.
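.PP
As an illustration of the option above, an invocation might look as follows.
The config file location is a hypothetical path chosen for this example and
is not a default known to
.BR moncheck .
.PP
.nf
.RS
moncheck \-\-config=/etc/monzero/moncheck.json
.RE
.fi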
.SH CONFIGURATION
The configuration file must be formatted as JSON.
Known keys and their effects are as follows:
.TP
.BR checker_id
The \fIchecker_id\fR is required and is used to look up which checks should be
run by the monzero instance.
It is okay to run multiple instances with the same \fIchecker_id\fR, as the scheduling
and locking are done in the database.

.TP
.BR db
Set the connection parameters for the PostgreSQL database. When using a
separate user, the user must have \fBwrite\fR permissions on the tables \fIactive_checks\fR
and \fInotifications\fR.

In addition, \fBread\fR permissions are required on the tables \fIchecks_notify\fR and \fImapping_level\fR.

The available options are \fIuser\fR, \fIdbname\fR, \fIhost\fR, \fIport\fR, and \fIpassword\fR.

.TP
.BR log
By default, log output is written to stderr in a human-readable format.
Use \fIoutput\fR to write to a file instead, or to one of \fIstdout\fR
or \fIstderr\fR.

Set \fIlevel\fR to one of \fIdebug\fR, \fIinfo\fR, \fIwarn\fR, or \fIerror\fR
to limit the output.

Set \fIformat\fR to \fIjson\fR instead of \fItext\fR
to get machine-readable log output.

.BR example

.nf
.RS
{
  "format": "text",
  "level": "info",
  "output": "stderr"
}
.RE
.fi

.TP
.BR path " - " \fRdefault: []
Set a list of lookup paths that are searched for check commands on the
filesystem.

.TP
.BR timeout " - " \fRdefault: 30s
The timeout sets the maximum time a command is allowed to run. When choosing
longer timeouts, be aware that commands running until the timeout can lead to
more waiting checks.

.TP
.BR wait " - " \fRdefault: 30s
The wait duration sets the time to wait between two checks and can be used to
lower database traffic or CPU usage.

.TP
.BR workers " - " \fRdefault: 25
Set the number of workers that run check commands in parallel. The more parallel
workers there are, the higher the lock contention on the database will become,
but at the same time long-running checks will have less of an impact on the
number of waiting checks.

Tune this value according to your available resources, foremost CPU cores.

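.PP
The tuning keys described above could be combined as in the following sketch.
It assumes that durations are written as JSON strings in the same notation as
the defaults listed above, and the lookup path is only an illustrative
location. The required keys \fIchecker_id\fR and \fIdb\fR are left out here.
.PP
.nf
.RS
{
  "path": ["/usr/lib/nagios/plugins"],
  "timeout": "30s",
  "wait": "30s",
  "workers": 25
}
.RE
.fi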
.SH CHECK COMMAND

A \fIcheck command\fR has to implement the Nagios API of a check command.

1. It must print a message on stdout.

2. It must set an exit code that indicates the severity level:

.RS
0 - the check was a success

1 - the check ended in an error

2 - the check ended in a warning

3 - the check is in an unknown state
.RE

If a check takes too long, it is caught by the timeout. Take care,
though, that checks don't take too much time, as the check interval only starts
after the check has ended, which can lead to fewer checks being run in a time
period than expected.
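.PP
As a rough sketch of this contract, a check command could be a small shell
script like the one below. The pid file path is a hypothetical value chosen
for this example; the exit codes follow the list above.
.PP
.nf
.RS
#!/bin/sh
# Sketch of a check command; the pid file path is an assumption.
pidfile=/var/run/example.pid

if [ \-f "$pidfile" ]; then
    # Message on stdout, exit code 0 for success.
    echo "OK: $pidfile exists"
    exit 0
fi

# Message on stdout, exit code 1 for an error.
echo "ERROR: $pidfile is missing"
exit 1
.RE
.fi
.PP
A script like this finishes almost immediately, so it stays well below the
configured \fItimeout\fR.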