
Nagios: Configuration

nagios.cfg

nagios_group=nagcmd

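Nagios Configuration Files Explained

Author: Joshua

Notes

When creating or editing configuration files, follow these rules:

  • Lines that start with '#' are treated as comments and are not processed;
  • A variable must start at the beginning of a line; no whitespace is allowed before the variable name;
  • Variable names are case-sensitive.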
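The whitespace rule is easy to break when pasting snippets into nagios.cfg. A minimal shell sketch of a check for it follows. The function name find_indented_directives is made up for this illustration and is not part of Nagios.

```shell
# Hypothetical helper (not part of Nagios): print "line-number:text" for every
# name=value line in a main config file that starts with whitespace.
find_indented_directives() {
    grep -nE '^[[:space:]]+[A-Za-z_0-9]+=' "$1"
}

# Usage (prints nothing when every variable starts in the first column):
#   find_indented_directives /usr/local/nagios/etc/nagios.cfg
```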

Configuration file list:
Main configuration file

Main configuration file [--prefix/nagios/etc/nagios.cfg]:

##############################################################################
#
# NAGIOS.CFG - Sample Main Config File for Nagios 3.0.6
#
# Read the documentation for more information on this configuration
# file.  I've provided some comments here, but things may not be so
# clear without further explanation.
#
# Last Modified: 10-15-2008
#
##############################################################################
 
# LOG FILE
# This is the main log file where service and host events are logged
# for historical purposes.  This should be the first option specified
# in the config file!!!
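# This variable sets where Nagios creates its log file.
# It should be the first variable in your main config file, because Nagios
# writes error messages to this file when it finds your config file and
# detects errors in it.
# If log rotation is enabled, Nagios rotates the log hourly, daily, weekly,
# or monthly.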
 
log_file=/usr/local/nagios/var/nagios.log
 
# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.
# Use one cfg_file= statement for each object configuration file that
# Nagios should process.

# You can specify individual object config files as shown below:
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# Definitions for monitoring a Windows machine
#cfg_file=/usr/local/nagios/etc/objects/windows.cfg

# Definitions for monitoring a router/switch
#cfg_file=/usr/local/nagios/etc/objects/switch.cfg

# Definitions for monitoring a network printer
#cfg_file=/usr/local/nagios/etc/objects/printer.cfg

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir
# directive as shown below:
 
#cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
#cfg_dir=/usr/local/nagios/etc/switches
#cfg_dir=/usr/local/nagios/etc/routers

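Variables in the configuration file:

Log file

Format:    log_file=<file_name>
Example:   log_file=/usr/local/nagios/var/nagios.log

Description:

This variable sets where Nagios creates its log file. It should be the first variable in your main configuration file, because Nagios writes error messages to this file when it finds your configuration file and detects errors in it. If log rotation is enabled, Nagios rotates the log hourly, daily, weekly, or monthly.

Object configuration file

Format:    cfg_file=<file_name>
Example:
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
cfg_file=/usr/local/nagios/etc/commands.cfg

Description:

This variable names an object configuration file that contains definitions of the objects Nagios monitors. Object configuration files hold the definitions of hosts, host groups, contacts, contact groups, services, commands, and so on. The configuration can be split across several files, with one cfg_file= statement pointing at each file to be processed.

Object configuration directory:

Format:    cfg_dir=<directory_name>
Example:
cfg_dir=/usr/local/nagios/etc/commands
cfg_dir=/usr/local/nagios/etc/services
cfg_dir=/usr/local/nagios/etc/hosts

Description:

This variable names a directory that contains object configuration files for Nagios. Every file in the directory with a .cfg extension is processed as a configuration file. Nagios also recurses into the subdirectories and processes all configuration files found there. You can spread the configuration across different directories, with one cfg_dir= statement pointing at each directory to be processed.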
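To preview which files a cfg_dir= statement would pull in, something like the following sketch can help. It only approximates the rule described above (regular files ending in .cfg, at any depth) with find. The name list_cfg_files is invented here, and Nagios itself remains the authority on what it actually loads.

```shell
# Hypothetical helper: list the files a cfg_dir= statement pointing at "$1"
# would be expected to pick up, i.e. regular *.cfg files at any depth.
list_cfg_files() {
    find "$1" -type f -name '*.cfg' | sort
}

# Usage:
#   list_cfg_files /usr/local/nagios/etc/servers
```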

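Installing the Nagios Monitoring System on FreeBSD

As the previous article, "Installing the Nagios Monitoring System on Linux, perl-fcgi, nginx", showed, installing Nagios is fairly simple, even with nginx in place of Apache. The hard part is the setup and the configuration parameters. Relax, though: we are going to get it working anyway, aren't we? Let's begin:

1: Get the latest packages from http://www.nagios.org/download
2: Log in to the server. The latest version at the time of writing is 3.0.6:
1) Nagios, version 3.0.6:

wget http://nchc.dl.sourceforge.net/sourceforge/nagios/nagios-3.0.6.tar.gz

2) Nagios plugins, version 1.4.13:

wget http://jaist.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz

3) Image pack:

http://dl.sf.net/nagios/imagepak-base.tar.gz

4) NRPE, version 2.5.2

http://ufpr.dl.sourceforge.net/sourceforge/nagios/nrpe-2.5.2.tar.gz

5) NSCA, version 2.6

http://kent.dl.sourceforge.net/sourceforge/nagios/nsca-2.6.tar.gz

3: Switch to the root user:
sudo su

4: Unpack the archive:
tar zxvf nagios-3.0.6.tar.gz

5: Create the user that runs Nagios:
adduser nagios

6: Create the Nagios installation directory and make nagios:nagios its owner:
mkdir /usr/local/nagios
chown nagios:nagios /usr/local/nagios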

7: Identify the web server user
Some commands may be run through the web interface, so you need to know which user the web server runs as, usually apache:
grep "^User" /usr/local/apache2/conf/httpd.conf

8: Create the command file group
This new group contains the Apache user and the nagios user:
pw groupadd nagcmd
pw usermod apache -G nagcmd
pw usermod nagios -G nagcmd
----------------------------------
cat /etc/group
nagcmd:*:9007:apache,nagios
----------------------------------

9: Run the configure script and install Nagios
cd nagios-3.0.6

./configure --prefix=/usr/local/nagios --with-gd-lib=/usr/local/lib --with-gd-inc=/usr/local/include
make all
make install
make install-init
make install-commandmode
make install-config

10: Install nagios-plugins
tar zxvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure --prefix=/usr/local/nagios-plugins
make all
make install
After installation, a libexec directory is created under /usr/local/nagios-plugins. Move the whole directory into /usr/local/nagios:
mv /usr/local/nagios-plugins/libexec/ /usr/local/nagios/

11: Install imagepak-base.tar.gz
tar -xvzf imagepak-base.tar.gz
Unpacking produces a base directory:
mv base/ /usr/local/nagios/share/images/logos/
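----------------------------------------------------------------------
Now start configuring:
----------------------------------------------------------------------
1: Configure the web interface
This assumes Apache is already running. If it is not, see:

http://localhost/upload/blog.php?do-showone-tid-18.html

vi /usr/local/apache2/conf/httpd.conf
Add the following:
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>

Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>

After the changes, save the file and restart Apache:
/usr/local/apache2/bin/apachectl restart

2: Configure Apache Basic authentication:
Create the password file (htpasswd prompts for the password of user nagios):
/usr/local/apache2/bin/htpasswd -c /usr/local/nagios/etc/htpasswd.users nagios
The Apache side of the web interface is now configured.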

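Start configuring Nagios:
cd /usr/local/nagios/etc/
/usr/local/nagios/etc contains the sample configuration templates (*.cfg-sample). Copy every .cfg-sample file to a matching .cfg file.
For example: cp nagios.cfg-sample nagios.cfg
Repeat until all of them are copied.

vi minimal.cfg
Comment out all the command definitions:
To comment out a definition, put "#" at the start of each of its lines.
Edit cgi.cfg
Change use_authentication=1 to use_authentication=0, which turns authentication off; otherwise some pages are not displayed.

Now check the configuration files for errors:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If everything is correct, the output ends with:
Total Warnings: 0
Total Errors: 0
Otherwise, fix the configuration files as the messages indicate.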
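The summary lines can also be checked from a script before restarting. This is a minimal sketch, assuming the output of nagios -v ends with the Total Warnings / Total Errors lines shown above. preflight_ok is a hypothetical helper name, not a Nagios command.

```shell
# Hypothetical helper: read the output of "nagios -v" on stdin and succeed
# only if it contains a "Total Errors: 0" summary line.
preflight_ok() {
    grep -Eq '^Total Errors:[[:space:]]*0[[:space:]]*$'
}

# Usage:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg \
#       | preflight_ok && echo "configuration looks clean"
```

The check reads only the summary, so warnings still need to be read by hand.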

We will return to the configuration files later. For now, start Nagios:
/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

To have Nagios restarted automatically if it exits abnormally, we run it under daemontools.
Install daemontools:
mkdir -p /package
chmod 1755 /package
cd /package
fetch http://cr.yp.to/daemontools/daemontools-0.76.tar.gz
tar -xpzf daemontools-0.76.tar.gz
cd admin/daemontools-0.76/
package/install
Check that the svscan process has started:
ps aux | grep svscan
root 376 0.0 0.0 1636 0 con- IW - 0:00.00 /bin/sh /command/svscanboot
root 411 0.0 0.0 1224 208 con- S 8Jul06 0:42.50 svscan /service

OK, it started normally.
cd /service
mkdir nagios
chmod 1755 nagios
cd nagios
touch ./run
chmod 755 ./run
vi run
#!/bin/sh
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

exec env - PATH=$PATH \
/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

mkdir log
cd log
touch ./run
chmod 755 ./run
vi ./run
#!/bin/sh
exec setuidgid logadmin multilog t s1000000 n100 ./main

mkdir main
chmod 777 main
chown nagios.nagios main
touch status
chown nagios.nagios status

svc -u /service/nagios/
svstat /service/nagios/
root@## ps auxww | grep nagios
root 23276 0.0 0.1 1176 488 ?? I 5:00PM 0:01.71 supervise nagios
nagios 34251 0.0 0.3 2316 1552 ?? S 6:06PM 0:00.10 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
root@##

OK, Nagios now runs as a supervised service that starts automatically.
The svc command starts or stops the service.
———————————————————————————
svc opts services
opts is a series of getopt-style options. services consists of any number of arguments, each argument naming a directory used by supervise.

-u: Up. If the service is not running, start it. If the service stops, restart it.
-d: Down. If the service is running, send it a TERM signal and then a CONT signal. After it stops, do not restart it.
-o: Once. If the service is not running, start it. Do not restart it if it stops.
-p: Pause. Send the service a STOP signal.
-c: Continue. Send the service a CONT signal.
-h: Hangup. Send the service a HUP signal.
-a: Alarm. Send the service an ALRM signal.
-i: Interrupt. Send the service an INT signal.
-t: Terminate. Send the service a TERM signal.
-k: Kill. Send the service a KILL signal.
-x: Exit. supervise will exit as soon as the service is down. If you use this option on a stable system, you’re doing something wrong; supervise is designed to run forever.
———————————————————————————
For example:
Stop Nagios:    svc -d /service/nagios/
Restart Nagios: svc -t /service/nagios/
Start Nagios:   svc -u /service/nagios/
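The three commands can be wrapped behind friendlier names. The sketch below only prints the svc command it would run, so it is safe to try anywhere. nagios_svc_cmd is a hypothetical name, and the mapping is taken from the examples above.

```shell
# Hypothetical helper: map an action name to the svc command listed above.
# It prints the command instead of running it (a dry run).
nagios_svc_cmd() {
    case "$1" in
        start)   echo "svc -u /service/nagios/" ;;
        stop)    echo "svc -d /service/nagios/" ;;
        restart) echo "svc -t /service/nagios/" ;;
        *)       echo "usage: nagios_svc_cmd start|stop|restart" >&2
                 return 1 ;;
    esac
}

nagios_svc_cmd restart
```

Piping the printed line to sh would execute it, which is left out here on purpose.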

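Of course, you can also use the rc.d init script:
/usr/local/etc/rc.d/nagios start/stop

Anyway, daemontools is powerful and you can get to know it over time. Back to the main topic.
Now open http://localhost/nagios/ in a browser.
You may be pleasantly surprised: the state of my server and its services is clearly visible.
For now Nagios monitors only one host, itself (localhost). We will add other hosts and their services shortly. First, let's see what Nagios really looks like: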

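Configuring Nagios:

1) Add a service to a host
2) Add a host and its services
3) Stop a service
4) Delete a host and its services
5) View the problems on all hosts
6) View the status of one specific host
7) Change the notification interval
8) Change the number of retries before a problem is reported
9) How to use external commands in Nagios

1) Add a service to a host
To monitor the qmail service on localhost, do the following:
vi minimal.cfg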
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description qmail_smtp
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_smtp!20%!10%!/
}

You can copy an existing definition and modify it; this one was copied from the existing check_local_disk definition.
Change host_name, service_description, check_command, and so on.

define service{
use generic-service ; Name of service template to use
host_name localhost
service_description qmail_pop3
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_pop!20%!10%!/
}
Modify it along the same lines, then edit:
vi checkcommands.cfg
# 'check_qmail' command definition
define command{
command_name check_qmail
command_line $USER1$/check_smtp -H 127.0.0.1
}
define command{
command_name check_pop3
command_line $USER1$/check_pop -H 127.0.0.1
}
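Save, then check the configuration files:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are no errors, the output ends with:
Total Warnings: 0
Total Errors: 0
If there are errors, fix them as the messages indicate.
Restart Nagios:
svc -d /service/nagios/ && svc -u /service/nagios/
Check the result in the Nagios web pages:

http://10.5.1.153/nagios/

Click "Service Detail".
The following appears: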

2) Add a host and its services
We will monitor this host's load, disk usage, and other state that is not exposed through a network port, as well as its services such as apache, mysql, qmail, and ntp. Can Nagios directly monitor state that has no port? No. So we must install NRPE on both hosts. The NRPE daemon listens on port 5666 and returns check results to the monitoring server.
OK, let's add apache, mysql, qmail, and ntp first. This time we put the monitored host and its services in a new file:
cd /usr/local/nagios/etc/
touch 10_5_1_156.cfg
vi nagios.cfg
cfg_file=/usr/local/nagios/etc/10_5_1_156.cfg

vi 10_5_1_156.cfg
Define a host:
define host{
use generic-host ; Name of host template to use
host_name test_nrpe
alias client
address 10.5.1.156
check_command check-host-alive
max_check_attempts 1
check_period 24x7
notification_interval 120
notification_period 24x7
notification_options d,r
contact_groups admins
}

Define the services to check on the host:
define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_ping!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description apache
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_http!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description mysql
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_mysql!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description ntp
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_ntp!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description qmail_smtp
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_smtp!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description qmail_pop3
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_pop!100.0,20%!500.0,60%
}
Just like before, the services are now defined as well:
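The service blocks above differ only in service_description and check_command, so a small generator can replace the copy-and-paste. This is a sketch rather than a Nagios feature: emit_service is a name invented here, and the fixed directives are simply copied from the definitions above.

```shell
# Hypothetical generator: print one "define service" block.
#   $1 = host_name, $2 = service_description, $3 = check_command
# The time period name is written with an ASCII "x" (24x7).
emit_service() {
    cat <<EOF
define service{
use generic-service
host_name $1
service_description $2
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command $3
}
EOF
}

# Print the PING and apache definitions for test_nrpe.
emit_service test_nrpe PING   'check_ping!100.0,20%!500.0,60%'
emit_service test_nrpe apache 'check_http!100.0,20%!500.0,60%'
```

Redirecting the output to a file such as 10_5_1_156.cfg and then running nagios -v on the main configuration keeps the generated definitions under the same check as hand-written ones.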

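Is there now one more host, with its services under it? Of course. Problems you may run into when adding hosts and services:
1: A configuration parameter is wrong. If you start Nagios without checking the configuration, it may start successfully but display incorrectly.
Solution: correct the configuration parameters.
2: Connection refused
When this happened, I first thought passwordless SSH login had failed, but in fact the service simply was not running on my server. Starting the service fixed it.

But those are all services with ports. How do we check state that is not exposed on a port?
With NRPE. OK, let's install NRPE on the server now:
I. Configuring the remote host
1. Install and configure NRPE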
fetch http://ufpr.dl.sourceforge.net/sourceforge/nagios/nrpe-2.5.2.tar.gz
tar zxvf nrpe-2.5.2.tar.gz
cd nrpe-2.5.2
./configure --enable-ssl --enable-command-args
make all
mkdir -p /usr/local/nagios/etc
mkdir /usr/local/nagios/bin
mkdir /usr/local/nagios/libexec
pw addgroup nagios
pw useradd nagios -g nagios -d /usr/local/nagios/ -s /sbin/nologin
chown -R nagios:nagios /usr/local/nagios
cp ./sample-config/nrpe.cfg /usr/local/nagios/etc
cp src/nrpe /usr/local/nagios/bin
2. Start NRPE; it listens on port 5666
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
netstat -ant | grep 5666
tcp4 0 0 *.5666 *.* LISTEN

II. Configuring the monitoring server
1. Install NRPE (mainly for the check_nrpe plugin)
fetch http://ufpr.dl.sourceforge.net/sourceforge/nagios/nrpe-2.5.2.tar.gz
tar zxvf nrpe-2.5.2.tar.gz
cd nrpe-2.5.2
./configure --enable-ssl --enable-command-args
make all
cp src/check_nrpe /usr/local/nagios/libexec
2. Nagios configuration files
vi checkcommands.cfg
Define the check_nrpe command:
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
III. We have already configured some of the parameters above. The final configuration is:
define host{
use generic-host ; Name of host template to use
host_name test_nrpe
alias client
address 10.5.1.156
check_command check-host-alive
max_check_attempts 1
check_period 24x7
notification_interval 120
notification_period 24x7
notification_options d,r
contact_groups admins
}

# 'check_load' command definition
define command{
command_name check_load
command_line $USER1$/check_load -w $ARG1$ -c $ARG2$
}

# 'check_disk' command definition
define command{
command_name check_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$
}
define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_ping!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description apache
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_http!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description mysql
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_mysql!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description ntp
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_ntp!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description qmail_smtp
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_smtp!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description qmail_pop3
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_pop!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description test_load
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_load!100.0,20%!500.0,60%
}

define service{
use generic-service ; Name of service template to use
host_name test_nrpe
service_description test_disk
is_volatile 0
check_period 24x7
max_check_attempts 1
normal_check_interval 1
retry_check_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 960
notification_period 24x7
check_command check_disk!100.0,20%!500.0,60%
}

IV. Check the configuration and restart Nagios

9) How to use external commands in Nagios
vi /usr/local/nagios/etc/nagios.cfg
check_external_commands=1

mkdir /usr/local/nagios/var/rw
chown nagios.nagcmd /usr/local/nagios/var/rw
chmod u+rw /usr/local/nagios/var/rw
chmod g+rw /usr/local/nagios/var/rw
chmod g+s /usr/local/nagios/var/rw

svc -t /service/nagios/
/usr/local/apache2/bin/apachectl restart

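Installing the Nagios Monitoring System on Linux, perl-fcgi, nginx

Author: Joshua
Date: early morning, January 17, 2009
Do not repost without permission
Installing Nagios
Unlike most other software, installing Nagios takes several steps.

Step 1: run make install to install the main program, the CGIs, and the HTML files;
Step 2: run make install-commandmode to set the permissions on the directory that holds the external command file;
Step 3: run make install-config to copy the sample configuration files into the Nagios installation directory.

As the installer's prompts indicate, there is also a make install-init step. It installs an init script for Nagios so that Nagios can start at boot, which is convenient.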

cd /usr/local/
wget http://nchc.dl.sourceforge.net/sourceforge/nagios/nagios-3.0.6.tar.gz
tar xzf nagios-3.0.6.tar.gz
cd nagios-3.0.6
/usr/sbin/useradd nagios
passwd nagios
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -G nagcmd nagios
./configure --with-command-group=nagcmd
make all
make install
make install-init
make install-config
make install-commandmode
vi /usr/local/nagios/etc/objects/contacts.cfg    # change the administrator's email address

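Verifying the Nagios installation

Verify that the program was installed correctly. Change to the installation path (/usr/local/nagios here) and check whether the five directories etc, bin, sbin, share, and var exist. If they do, the program was installed correctly. A brief description of the five directories:

bin    Nagios executables; this directory contains only one file, nagios
etc    Nagios configuration files; right after installation it holds only a few *.cfg-sample files
sbin   Nagios CGI files, that is, the files needed to run external commands
share  Nagios web page files
var    Nagios log files, the PID file, and similar files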
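The directory check described above can be scripted. The following is a minimal sketch; check_nagios_layout is a hypothetical helper name, and the list of directories comes straight from the description above.

```shell
# Hypothetical helper: report which of the five expected directories are
# missing under the install prefix given as $1. Returns 0 if all exist.
check_nagios_layout() {
    prefix=$1
    missing=0
    for d in bin etc sbin share var; do
        if [ ! -d "$prefix/$d" ]; then
            echo "missing: $prefix/$d"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_nagios_layout /usr/local/nagios && echo "layout looks complete"
```

A clean result only shows that the directories exist; it says nothing about whether the binaries inside them work.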

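Installing the Nagios plugins

Without plugins, Nagios cannot do anything; plugins are also a powerful way to extend Nagios. Besides downloading the common plugins, we can write our own to meet specific needs. The plugin package nagios-plugins-1.4.13 can be found on www.nagios.org, and we download it with wget. Note: the plugin version is only loosely tied to the Nagios version, so you do not have to use nagios-plugins-1.4.13 specifically. Once downloaded, installation is simple: run ./configure --prefix=/usr/local/nagios, then build and install with make; make install. Note that the prefix given to configure is /usr/local/nagios, not /usr/local/nagios-plus. After installation, the directory /usr/local/nagios contains a libexec directory (with many files in it), which is exactly what Nagios needs.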

wget http://jaist.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz
tar xf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure --with-nagios-user=nagios --with-nagios-group=nagcmd
make
make install

Starting Nagios at boot

/sbin/chkconfig --add nagios
/sbin/chkconfig nagios on
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/sbin/service nagios start

Installing the Perl FCGI module

      wget http://www.cpan.org/modules/by-module/FCGI/FCGI-0.67.tar.gz
      tar -zxvf FCGI-0.67.tar.gz
      cd FCGI-0.67
      perl Makefile.PL
      make && make install

Installing FCGI-ProcManager:

      wget http://search.cpan.org/CPAN/authors/id/G/GB/GBJK/FCGI-ProcManager-0.18.tar.gz
      tar -xzf FCGI-ProcManager-0.18.tar.gz
      cd FCGI-ProcManager-0.18
      perl Makefile.PL
      make
      make install
cd /usr/local/nagios/bin/
vi perl-cgi.pl
#!/usr/bin/perl
 
use FCGI;
#perl -MCPAN -e 'install FCGI'
use Socket;
 
#this keeps the program alive or something after exec'ing perl scripts
END() { } BEGIN() { }
*CORE::GLOBAL::exit = sub { die "fakeexit\nrc=".shift()."\n"; }; eval q{exit}; if ($@) { exit unless $@ =~ /^fakeexit/; } ;
 
&main;
 
sub main {
                #$socket = FCGI::OpenSocket( ":3461", 10 ); #use IP sockets
                $socket = FCGI::OpenSocket( "/var/run/nagios.sock", 10 ); #use UNIX sockets - user running this script must have w access to the 'nginx' folder!!
                $request = FCGI::Request( \*STDIN, \*STDOUT, \*STDERR, \%ENV, $socket );
                if ($request) {request_loop()};
                        FCGI::CloseSocket( $socket );
}
 
sub request_loop {
                while( $request->Accept() >= 0 ) {
 
                   #processing any STDIN input from WebServer (for CGI-GET actions)
                   $env = $request->GetEnvironment();
                   $stdin_passthrough ='';
                   $req_len = 0 + $ENV{CONTENT_LENGTH};
                   if ($ENV{REQUEST_METHOD} eq 'GET'){
                                $stdin_passthrough .= $ENV{'QUERY_STRING'};
                        }
 
                        #running the cgi app
                        if ( (-x $ENV{SCRIPT_FILENAME}) && #can I execute this?
                                 (-s $ENV{SCRIPT_FILENAME}) && #Is this file empty?
                                 (-r $ENV{SCRIPT_FILENAME})     #can I read this file?
                        ){
                                #http://perldoc.perl.org/perlipc.html#Safe-Pipe-Opens
                                # Print the error page only when open() fails
                                open $cgi_app, '-|', $ENV{SCRIPT_FILENAME}, $stdin_passthrough
                                    or do {
                                        print("Content-type: text/plain\r\n\r\n");
                                        print "Error: CGI app returned no output - Executing $ENV{SCRIPT_FILENAME} failed !\n";
                                    };
                                if ($cgi_app) {print <$cgi_app>; close $cgi_app;}
                        }
                        else {
                                print("Content-type: text/plain\r\n\r\n");
                                print "Error: No such CGI app - $req_len - $ENV{CONTENT_LENGTH} - $ENV{REQUEST_METHOD} - $ENV{SCRIPT_FILENAME} may not exist or is not executable by this process.\n";
                        }
 
                }
}
vi start_nginx_cgi.sh
#!/bin/bash
 
## start_nginx_cgi.sh: start nginx cgi mode
## ljzhou, 2007.08.20
 
 
PERL="/usr/bin/perl"
NGINX_CGI_FILE="/usr/local/nagios/bin/perl-cgi.pl"
 
 
#bg_num=`jobs -l |grep "NGINX_CGI_FILE"`
#PID=`ps aux|grep "perl-cgi"|cut -c10-14|xargs kill -9`
PID=`ps aux|grep 'perl-cgi'|cut -c10-14|sed -n "1P"`
echo $PID
sockfiles="/var/run/nagios.sock"
kill -9 $PID
 
$PERL $NGINX_CGI_FILE &
 
sleep 3
 
chown nobody:nobody $sockfiles

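Create the authentication file
Note: the user created here is the login user for the Nagios web administration interface, and it is also tied to permissions inside Nagios.

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Edit nginx.conf and add the following inside the virtual host (server block):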

        location /nagios {
                auth_basic              "Restricted";
                auth_basic_user_file    /usr/local/nagios/etc/htpasswd.users;
        }
        location ~ \.cgi$ {
                root /usr/local/nagios/sbin;
                rewrite ^/nagios/cgi-bin/(.*)\.cgi /$1.cgi break;
                fastcgi_index index.cgi;
                fastcgi_pass    unix:/var/run/nagios.sock;
 
                fastcgi_param   SCRIPT_FILENAME                     /usr/local/nagios/sbin$fastcgi_script_name;
                fastcgi_param   QUERY_STRING                        $query_string;
 
                fastcgi_param   REMOTE_ADDR                        $remote_addr;
                fastcgi_param   REMOTE_PORT                        $remote_port;
                fastcgi_param   REQUEST_METHOD                     $request_method;
                fastcgi_param   REQUEST_URI                        $request_uri;
 
                #fastcgi_param SCRIPT_NAME                        $fastcgi_script_name;
                fastcgi_param   SERVER_ADDR                        $server_addr;
                fastcgi_param   SERVER_NAME                        $server_name;
                fastcgi_param   SERVER_PORT                        $server_port;
                fastcgi_param   SERVER_PROTOCOL                    $server_protocol;
                fastcgi_param   SERVER_SOFTWARE                    nginx;
 
                fastcgi_param   CONTENT_LENGTH                     $content_length;
                fastcgi_param   CONTENT_TYPE                       $content_type;
                fastcgi_param   GATEWAY_INTERFACE                  CGI/1.1;
                fastcgi_param   HTTP_ACCEPT_ENCODING        gzip,deflate;
                fastcgi_param   HTTP_ACCEPT_LANGUAGE        zh-cn;
       }

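Then run /usr/local/nagios/bin/start_nginx_cgi.sh and add it to /etc/rc.local.

/usr/local/nagios/bin/start_nginx_cgi.sh
vi /etc/rc.local    # add /usr/local/nagios/bin/start_nginx_cgi.sh as the last line

Reload nginx:

ps -aux|grep nginx
kill -HUP pid    # pid is the nginx master process ID reported by ps

Finally, make the Nagios share directory available under the virtual host's document root under the name nagios. The best way is a symlink:

cd <your virtual host's document root>
ln -s /usr/local/nagios/share nagios

Visit: http://yourhost.com/nagios
If you see the same interface as mine, everything is working.

Nagios installed successfully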

nginx configuration, including PHP (FastCGI), Perl, proxy, RRD, and Nagios

The nginx configuration file, nginx.conf

 
worker_processes 5;
 
error_log logs/error.log;
error_log logs/error.log info;
 
events {
  use kqueue;
  worker_connections 2048;
}
 
http {
  include mime.types;
  default_type application/octet-stream;
  server_names_hash_bucket_size 64;
 
  log_format main '$remote_addr - $remote_user [$time_local] $request '
                  '"$status" $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for"';
 
  access_log logs/access.log main;
 
  sendfile on;
 
  keepalive_timeout 65;
 
  tcp_nopush on;
 
  upstream proxy {
    server 192.168.0.2:80 weight=2;
    server 192.168.0.3:80;
  }
 
  server {
    listen 80;
    server_name my.example.com 192.168.0.1;
 
    access_log logs/my.example.com.access.log main;
 
    location /status {
      stub_status on;
      access_log off;
      allow 192.168.0.1;
      deny all;
    }
 
    location / {
      root /usr/local/www/status;
      index index.php;
      allow 192.168.100.1;
      deny all;
    }
 
    location ~ \.php$ {
      fastcgi_pass unix:/tmp/php-fastcgi.sock;
      fastcgi_index index.php;
      fastcgi_param SCRIPT_FILENAME /usr/local/www/status$fastcgi_script_name;
      include fastcgi_params;
    }
 
    location /nagios {
      root /usr/local/www;
      allow 192.168.100.1;
      deny all;
    }
 
    location ~ \.cgi$ {
      root /usr/local/www/nagios/cgi-bin;
      rewrite ^/nagios/cgi-bin/(.*)\.cgi /$1.cgi break;
      fastcgi_index index.cgi;
      allow 192.168.100.1;
      deny all;
      fastcgi_pass unix:/tmp/perl_cgi-dispatch.sock;
      fastcgi_param HTTP_ACCEPT_ENCODING gzip,deflate;
      fastcgi_param SCRIPT_FILENAME /usr/local/www/nagios/cgi-bin$fastcgi_script_name;
      include fastcgi_params;
    }
  }
 
  server {
    listen 80;
    server_name proxy.example.com;
 
    access_log logs/proxy.example.com.access.log main;
 
    location / {
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_pass http://proxy;
    }
  }
}

nginx-rrd.conf

 
#####################################################
#
# dir where rrd databases are stored
RRD_DIR="/var/spool/nginx-rrd";
# dir where png images are presented
WWW_DIR="/usr/local/www/status";
# process nice level
NICE_LEVEL="-19";
# bin dir
BIN_DIR="/usr/sbin";
# servers to test
# server_url;server_name
SERVERS_URL="http://my.example.com/status;my.example.com http://192.168.0.2/status;2"

fastcgi-php (creates the PHP FastCGI socket)

. /etc/rc.subr

name="fcgiphp"
rcvar=`set_rcvar`
 
load_rc_config $name
 
: ${fcgiphp_enable="NO"}
: ${fcgiphp_bin_path="/usr/local/bin/php-cgi"}
: ${fcgiphp_user="www"}
: ${fcgiphp_group="www"}
: ${fcgiphp_children="10"}
: ${fcgiphp_port="8002"}
: ${fcgiphp_socket="/tmp/php-fastcgi.sock"}
: ${fcgiphp_env="SHELL PATH USER"}
: ${fcgiphp_max_requests="500"}
: ${fcgiphp_addr="localhost"}
 
pidfile=/var/run/fastcgi/fcgiphp.pid
procname="${fcgiphp_bin_path}"
command_args="/usr/local/bin/spawn-fcgi -f ${fcgiphp_bin_path} -u ${fcgiphp_user} -g ${fcgiphp_group} -C ${fcgiphp_children} -P ${pidfile}"
start_precmd=start_precmd
stop_postcmd=stop_postcmd
 
start_precmd()
{
        PHP_FCGI_MAX_REQUESTS="${fcgiphp_max_requests}"
        FCGI_WEB_SERVER_ADDRS=$fcgiphp_addr
        export PHP_FCGI_MAX_REQUESTS
        export FCGI_WEB_SERVER_ADDRS
        allowed_env="${fcgiphp_env} PHP_FCGI_MAX_REQUESTS FCGI_WEB_SERVER_ADDRS"
# copy the allowed environment variables
        E=""
        for i in $allowed_env; do
                eval "x=\$$i"
                E="$E $i=$x"
        done
        command="env - $E"
 
        if [ -n "${fcgiphp_socket}" ]; then
                command_args="${command_args} -s ${fcgiphp_socket}"
        elif [ -n "${fcgiphp_port}" ]; then
                command_args="${command_args} -p ${fcgiphp_port}"
        else
                echo "socket or port must be specified!"
                exit
        fi
}
 
stop_postcmd()
{
        rm -f ${pidfile}
#       eval "ipcs | awk '{ if (\$5 == \"${fcgiphp_user}\") print \"ipcrm -s \"\$2}' | /bin/sh"
}
 
run_rc_command "$1"

perl-fcgi.pl (creates the Perl socket); you need the FCGI Perl module

#!/usr/bin/perl
 
use FCGI;
#perl -MCPAN -e 'install FCGI'
use Socket;
 
#this keeps the program alive or something after exec'ing perl scripts
END() { } BEGIN() { }
*CORE::GLOBAL::exit = sub { die "fakeexit\nrc=".shift()."\n"; }; eval q{exit}; if ($@) { exit unless $@ =~ /^fakeexit/; } ;
 
&main;
 
sub main {
        #$socket = FCGI::OpenSocket( ":3461", 10 ); #use IP sockets
        $socket = FCGI::OpenSocket( "/tmp/perl_cgi-dispatch.sock", 10 ); #use UNIX sockets - user running this script must have w access to the 'nginx' folder!!
        $request = FCGI::Request( \*STDIN, \*STDOUT, \*STDERR, \%req_params, $socket );
        if ($request) { request_loop()};
            FCGI::CloseSocket( $socket );
}
 
sub request_loop {
        while( $request->Accept() >= 0 ) {
 
           #processing any STDIN input from WebServer (for CGI-POST actions)
           $stdin_passthrough ='';
           $req_len = 0 + $req_params{'CONTENT_LENGTH'};
           if (($req_params{'REQUEST_METHOD'} eq 'POST') && ($req_len != 0) ){
                        while ($req_len) {
                            $stdin_passthrough .= getc(STDIN);
                            $req_len--;
                        }
            }
 
            #running the cgi app
            if ( (-x $req_params{SCRIPT_FILENAME}) &&  #can I execute this?
                 (-s $req_params{SCRIPT_FILENAME}) &&  #Is this file empty?
                 (-r $req_params{SCRIPT_FILENAME})     #can I read this file?
            ){
                foreach $key ( keys %req_params){
                   $ENV{$key} = $req_params{$key};
                }
                #http://perldoc.perl.org/perlipc.html#Safe-Pipe-Opens
                # Print the error page only when open() fails
                open $cgi_app, '-|', $req_params{SCRIPT_FILENAME}, $stdin_passthrough
                    or do {
                        print("Content-type: text/plain\r\n\r\n");
                        print "Error: CGI app returned no output - Executing $req_params{SCRIPT_FILENAME} failed !\n";
                    };
                if ($cgi_app) {print <$cgi_app>; close $cgi_app;}
            }
            else {
                print("Content-type: text/plain\r\n\r\n");
                print "Error: No such CGI app - $req_params{SCRIPT_FILENAME} may not exist or is not executable by this process.\n";
            }
 
        }
}