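A Service Monitoring Program Written in Python: An Example
Contributed by: junjie
This article presents an example of a service monitoring program written in Python. It gives the implementation code directly, for anyone who needs it as a reference.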
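Preface:
Installing Python 2.7 on Red Hat
RHEL 6.4 ships with Python 2.6, and some machines turned out to have Python 2.4. Download the source code from the Python website, unpack it on the Red Hat machine, and then run the following commands: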
# ./configure --prefix=/usr/local/python27
# make
# make install
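After this installation, Python 2.7 is not enabled by default; the new version has to be invoked as /usr/local/python27/bin/python2.7.
The installation below, by contrast, takes over the existing python command, because it installs into the default prefix /usr/local, whose bin directory usually comes before /usr/bin in PATH: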
# ./configure
# make
# make install
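After either installation it is worth confirming which interpreter a given command actually starts. The following is a minimal sketch; the helper name `is_at_least` is made up for illustration and is not part of the original program. Run it with the interpreter path you intend to use.

```python
import sys


def is_at_least(major, minor, version_info=None):
    """Return True if the interpreter version is at least major.minor."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (major, minor)


if __name__ == "__main__":
    # sys.executable is the path of the interpreter running this script.
    print("%s -> %d.%d" % (sys.executable, sys.version_info[0], sys.version_info[1]))
    print("2.7 or newer: %s" % is_at_least(2, 7))
```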
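Getting started:
The service child process is created and monitored by a monitoring main process. When the child exits abnormally, the main process starts it again. The program uses Python's subprocess module. Simple as this code is, I could not find a ready-to-use example on the Internet, so I wrote one and am sharing it here.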
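The restart idea itself fits in a few lines. This is a minimal sketch rather than the article's program: the name `supervise` and the `max_restarts` bound are assumptions added so that the example terminates, whereas the real monitor loops forever and also captures the child's output.

```python
import subprocess
import sys


def supervise(command, max_restarts):
    """Run command, starting it again each time it exits.

    The command is run max_restarts + 1 times in total.
    Returns the list of exit codes, one per run.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        child = subprocess.Popen(command)
        exit_codes.append(child.wait())  # block until this run ends
    return exit_codes


if __name__ == "__main__":
    # Hypothetical child: a one-line Python program that exits with status 3.
    failing_child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing_child, max_restarts=2))
```

Here `wait()` blocks until the child ends; the full program uses `poll()` instead so that it can keep reading the child's output between checks.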
First, the main process code: service_mgr.py
#!/usr/bin/python
#-*- coding: UTF-8 -*-
# cheungmine
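# stdin, stdout and stderr are the child program's standard input, standard
# output and standard error.
#
# Accepted values:
#   subprocess.PIPE - create a new pipe.
#   a valid file descriptor (in practice a positive integer)
#   a file object
#   None - no redirection; the child inherits the parent's file descriptors.
#
# stderr may also be subprocess.STDOUT, which sends the child's standard
# error to its standard output.
#
# subprocess.PIPE
#   A special value usable for Popen's stdin, stdout and stderr arguments;
#   it means a new pipe should be created.
#
# subprocess.STDOUT
#   A special value usable for Popen's stderr argument; it means the child's
#   standard error is merged into its standard output.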
################################################################################
import os
import sys
import getopt
import time
import datetime
import codecs
import optparse
import ConfigParser
import signal
import subprocess
import select
# logging
# require python2.6.6 and later
import logging
from logging.handlers import RotatingFileHandler
## log settings: SHOULD BE CONFIGURED BY config
LOG_PATH_FILE = "./my_service_mgr.log"
LOG_MODE = 'a'
LOG_MAX_SIZE = 4*1024*1024 # 4M per file
LOG_MAX_FILES = 4 # 4 backup files: my_service_mgr.log.1, my_service_mgr.log.2, ...
LOG_LEVEL = logging.DEBUG
LOG_FORMAT = "%(asctime)s %(levelname)-10s[%(filename)s:%(lineno)d(%(funcName)s)] %(message)s"
handler = RotatingFileHandler(LOG_PATH_FILE, LOG_MODE, LOG_MAX_SIZE, LOG_MAX_FILES)
formatter = logging.Formatter(LOG_FORMAT)
handler.setFormatter(formatter)
Logger = logging.getLogger()
Logger.setLevel(LOG_LEVEL)
Logger.addHandler(handler)
# color output
#
pid = os.getpid()
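def print_error(s):
    # red text, then reset all attributes
    print('\033[31m[%d: ERROR] %s\033[0m' % (pid, s))

def print_info(s):
    # green text, then reset all attributes
    print('\033[32m[%d: INFO] %s\033[0m' % (pid, s))

def print_warning(s):
    # yellow text, then reset all attributes
    print('\033[33m[%d: WARNING] %s\033[0m' % (pid, s))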
def start_child_proc(command, merged):
    try:
        if command is None:
            raise OSError("Invalid command")
        child = None
        if merged is True:
            # merge stdout and stderr
            child = subprocess.Popen(command,
                stderr=subprocess.STDOUT, # send the child's stderr to its stdout
                stdout=subprocess.PIPE    # create a new pipe
                )
        else:
            # DO NOT merge stdout and stderr
            child = subprocess.Popen(command,
                stderr=subprocess.PIPE,
                stdout=subprocess.PIPE)
        return child
    except subprocess.CalledProcessError:
        pass # handle errors in the called executable
    except OSError:
        pass # executable not found
    # reached only when one of the handlers above swallowed an error
    raise OSError("Failed to run command!")
def run_forever(command):
    print_info("start child process with command: " + ' '.join(command))
    Logger.info("start child process with command: " + ' '.join(command))
    merged = False
    child = start_child_proc(command, merged)
    line = ''
    errln = ''
    failover = 0
    while True:
        # restart the child whenever it has exited
        while child.poll() is not None:
            failover = failover + 1
            print_warning("child process shutdown with return code: " + str(child.returncode))
            Logger.critical("child process shutdown with return code: " + str(child.returncode))
            print_warning("restart child process again, times=%d" % failover)
            Logger.info("restart child process again, times=%d" % failover)
            child = start_child_proc(command, merged)
        # read child process stdout one character at a time and print each line;
        # read(1) blocks until a character arrives or the pipe reaches EOF
        ch = child.stdout.read(1)
        if ch != '' and ch != '\n':
            line += ch
        if ch == '\n':
            print_info(line)
            line = ''
        if merged is not True:
            # read child process stderr and log it
            ch = child.stderr.read(1)
            if ch != '' and ch != '\n':
                errln += ch
            if ch == '\n':
                Logger.info(errln)
                print_error(errln)
                errln = ''
    # not inside an exception handler, so use error() rather than exception()
    Logger.error("!!!should never run to this!!!")

if __name__ == "__main__":
    run_forever(["python", "./testpipe.py"])
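The read loop in run_forever takes one character from stdout and then one from stderr, and each read blocks. That suits testpipe.py, which alternates between the two streams, but a child that writes to only one of them could stall the loop. One common alternative, sketched here under assumptions and not taken from the original program, is to drain each pipe in its own thread. The names `pump` and `run_and_collect` are illustrative only.

```python
import subprocess
import sys
import threading


def pump(stream, sink):
    """Read lines from stream until EOF and append them to sink."""
    for raw in iter(stream.readline, b""):
        sink.append(raw.decode("utf-8", "replace").rstrip("\r\n"))
    stream.close()


def run_and_collect(command):
    """Run command once; return (exit_code, stdout_lines, stderr_lines)."""
    child = subprocess.Popen(command,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
    out_lines, err_lines = [], []
    readers = [
        threading.Thread(target=pump, args=(child.stdout, out_lines)),
        threading.Thread(target=pump, args=(child.stderr, err_lines)),
    ]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()  # each reader ends when its pipe reaches EOF
    return child.wait(), out_lines, err_lines


if __name__ == "__main__":
    # Hypothetical child that writes to stderr only.
    quiet_child = [sys.executable, "-c",
                   "import sys; sys.stderr.write('only stderr\\n')"]
    print(run_and_collect(quiet_child))
```

A monitor built this way would call a logging function inside `pump` instead of collecting lines in a list, and would restart the child once both readers have finished.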
Next, the child process code: testpipe.py
#!/usr/bin/python
#-*- coding: UTF-8 -*-
# cheungmine
# Simulates a worker process that dies after about 10 seconds
import os
import sys
import time
import random
cnt = 10
while cnt >= 0:
    time.sleep(0.5)
    sys.stdout.write("OUT: %s\n" % str(random.randint(1, 100000)))
    sys.stdout.flush()
    time.sleep(0.5)
    sys.stderr.write("ERR: %s\n" % str(random.randint(1, 100000)))
    sys.stderr.flush()
    #print str(cnt)
    #sys.stdout.flush()
    cnt = cnt - 1
sys.exit(-1)
Running it on Linux is simple:
$ python service_mgr.py
On Windows, run it as a background process:
> start pythonw service_mgr.py
In that case, the call at the end of the code needs to be changed to:
run_forever(["python", "testpipe.py"])
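One way to avoid editing the command for each platform is to build it from the running interpreter and an explicit directory. This is a sketch under assumptions: the helper `child_command` and its `base_dir` parameter are hypothetical and do not appear in the original program.

```python
import os
import sys


def child_command(script_name, base_dir=None):
    """Build an argv list that runs script_name with the current interpreter.

    base_dir defaults to the current working directory, which matches the
    relative path used in the original call.
    """
    if base_dir is None:
        base_dir = os.getcwd()
    return [sys.executable, os.path.join(base_dir, script_name)]


if __name__ == "__main__":
    print(child_command("testpipe.py"))
```

The result could then be passed to run_forever in place of the hard-coded list. Note that sys.executable is whichever interpreter runs the monitor, so it is pythonw.exe if the monitor was started that way.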