.TH moncheck 1
.SH NAME
moncheck \- runs shell commands as checks
.SH SYNOPSIS
.B moncheck
[\fB\-\-config\fR \fIPATH\fR]
.SH DESCRIPTION
.B moncheck
runs commands as checks and writes the results back to the database.
.SH OPTIONS
.TP
.BR \-\-config =\fIPATH\fR
Use the specified config file.
.SH CONFIGURATION
The configuration file must be written in JSON.
The known keys and their effects are as follows:
.TP
.BR checker_id
The \fIchecker_id\fR is required and is used to look up which checks should be
run by the monzero instance.
It is okay to run multiple instances with the same \fIchecker_id\fR, as the scheduling
and locking are done in the database.
.TP
.BR db
Set the connection parameters for the PostgreSQL database. When using a
separate user, that user must have \fBwrite\fR permissions on the tables
\fIactive_checks\fR and \fInotifications\fR, and \fBread\fR permissions on the
tables \fIchecks_notify\fR and \fImapping_level\fR.
The available options are \fIuser\fR, \fIdbname\fR, \fIhost\fR, \fIport\fR, and \fIpassword\fR.
.TP
.BR log
By default, log output goes to stderr in a human-readable format.
Use \fIoutput\fR to write to a file or to one of \fIstdout\fR or \fIstderr\fR.
Use \fIlevel\fR with one of \fIdebug\fR, \fIinfo\fR, \fIwarn\fR, or \fIerror\fR
to limit the output.
Set \fIformat\fR to \fIjson\fR instead of \fItext\fR to get machine-readable
log output.
.IP
.B Example:
.nf
.RS
{
  "format": "text",
  "level": "info",
  "output": "stderr"
}
.RE
.fi
.TP
.BR path " - " \fRdefault: []
Set a list of paths that are searched for check commands on the
filesystem.
.TP
.BR timeout " - " \fRdefault: 30s
The timeout sets the maximum time a command is allowed to run. When choosing
longer timeouts, be aware that long-running commands can lead to more waiting checks.
.TP
.BR wait " - " \fRdefault: 30s
The wait duration sets the time to wait between two checks and can be used to
lower database traffic or CPU usage.
.TP
.BR workers " - " \fRdefault: 25
Set the number of workers that run check commands in parallel. The more parallel
workers there are, the higher the lock contention on the database will become,
but at the same time long-running checks will have less of an impact on the
number of waiting checks.
Tune this value according to your available resources, above all the number of
CPU cores.
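.PP
The documented defaults can be illustrated with a short Python sketch that parses
a configuration and fills in the missing keys. It is not how moncheck itself
loads its configuration. The function name \fIeffective_config\fR, the example
values, and the choice to write durations as strings such as "30s" are
assumptions made for this illustration.
.nf
.RS
```python
import copy
import json

# Defaults as documented above. Writing durations as strings is an assumption.
DEFAULTS = {"path": [], "timeout": "30s", "wait": "30s", "workers": 25}
# The log defaults to human readable text on stderr.
LOG_DEFAULTS = {"format": "text", "output": "stderr"}


def effective_config(text):
    """Parse a JSON config and fill in the documented defaults."""
    raw = json.loads(text)
    if "checker_id" not in raw:
        raise ValueError("checker_id is required")
    config = copy.deepcopy(DEFAULTS)
    config.update(raw)
    config["log"] = {**LOG_DEFAULTS, **raw.get("log", {})}
    return config


if __name__ == "__main__":
    # Hypothetical minimal config; the db key is left out for brevity.
    example = '{"checker_id": 1, "workers": 4}'
    print(json.dumps(effective_config(example), indent=2, sort_keys=True))
```
.RE
.fi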
.SH CHECK COMMAND
A \fIcheck command\fR has to implement the nagios API of a check command.
.nf
1. It must return a message on stdout
2. It must have an exit code to show the severity level
.RS
0 - the check was a success
1 - the check ended in an error
2 - the check ended in a warning
3 - the check is in an unknown state
.RE
.fi
.PP
If a check takes too long, it is caught by the timeout. Take care, though,
that checks do not take too much time: the check interval only starts
after the check has ended, so long-running checks can lead to fewer checks in a
time period than expected.
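.PP
As a minimal sketch of the two rules above, the following Python code implements
the core of a hypothetical disk usage check. The check itself, the function
names, and the thresholds of 80 and 90 percent are assumptions chosen for
illustration. The exit codes follow the list above.
.nf
.RS
```python
"""Core of an example check command: report the usage of a filesystem."""
import shutil
import sys

# Exit codes as listed in this section.
OK, ERROR, WARNING, UNKNOWN = 0, 1, 2, 3


def classify(percent_used, warn_at=80.0, error_at=90.0):
    """Map a usage percentage to an exit code (thresholds are hypothetical)."""
    if percent_used >= error_at:
        return ERROR
    if percent_used >= warn_at:
        return WARNING
    return OK


def run_check(path):
    """Return (exit_code, message) for the filesystem containing path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as err:
        return UNKNOWN, "cannot read disk usage of {}: {}".format(path, err)
    if usage.total == 0:
        return UNKNOWN, "{} reports a total size of zero".format(path)
    percent = 100.0 * usage.used / usage.total
    return classify(percent), "{} is {:.1f}% full".format(path, percent)


def main(argv):
    """Print the message on stdout and return the exit code."""
    path = argv[1] if len(argv) > 1 else "/"
    code, message = run_check(path)
    print(message)
    return code
```
.RE
.fi
.PP
To use this as a check command, the script would end by calling
\fBsys.exit(main(sys.argv))\fR so that the severity is passed on as the exit code.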