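"Ops Experience Sharing" is a series that currently has 7 articles:
"Ops Experience Sharing (1) -- Linux Shell: ChatterServer Service Control Script"
"Ops Experience Sharing (2) -- Linux Shell: Second Optimization of the ChatterServer Service Control Script"
"Ops Experience Sharing (3) -- Solving crontab Failing to Run Shell Scripts Correctly on Ubuntu (Part 1)"
"Ops Experience Sharing (4) -- Analysis of the Design Approach for a Service Control Script for Java Process Management"
"Ops Experience Sharing (5) -- An Improved Service Control Script for Java Process Management"
"Ops Experience Sharing (6) -- A Deeper Look at crontab Failing to Run Shell Scripts Correctly (Part 2)"
"Ops Experience Sharing (7) -- Linux Shell: Third Optimization of the ChatterServer Service Control Script"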
====================================Divider======================================
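This is the service control script for ChatterServer. It lets the service command start, stop, restart, and report the status of ChatterServer, in the same way that MySQL has /etc/init.d/mysql or /etc/init.d/mysqld. The ChatterServer script is harder to write, though, because capturing and evaluating some of the state is more involved. The main reasons are the way ChatterServer runs and what happens while it starts up; both are documented in the comments inside the script.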
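ChatterServer runs on Ubuntu, which differs from CentOS in several ways. One example is sleep 1 versus usleep 100000: CentOS ships a usleep command for sub-second delays, while Ubuntu does not, so this script falls back to whole-second sleep calls, which makes start and stop slower than they need to be. (GNU sleep also accepts fractional values such as sleep 0.1, which gives the same granularity without usleep; the script does not use this yet.)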
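If a finer delay is wanted on Ubuntu, one option is a fractional argument to sleep. The sketch below assumes GNU coreutils sleep, which accepts decimal values; short_pause is a hypothetical helper name and is not part of the script in this article.

```shell
#!/bin/bash
# Pause for a fraction of a second without the usleep command.
# Assumption: sleep comes from GNU coreutils, which accepts decimal seconds.
short_pause() {
    # $1: delay in seconds, may be fractional (0.1 s equals usleep 100000)
    sleep "${1:-0.1}"
}

short_pause 0.1 && echo "paused"
```

On a system whose sleep only accepts integers, the call fails with a non-zero status, so the caller can fall back to sleep 1.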
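To make the script easier to follow, here are two of the problems it has to deal with:
1. After the core java command line for ChatterServer is executed, the listening port comes up late, probably because of the program itself or for performance reasons;
2. The ChatterServer jar reads some configuration files from a specific directory, so the script must change into that directory before running the core java command line;
How they are handled:
For the first problem, the script combines several checks with delayed execution;
For the second problem, which caused a lot of trouble during debugging and was only understood after talking to the developers, the script runs cd $BASEDIR before launching java;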
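The "several checks plus a delay" idea for the first problem can also be written as a small polling loop instead of one fixed sleep. This is only a sketch of that approach, not what the script below does: wait_until and port_listening are hypothetical helper names, and the port test reuses the netstat and grep LISTEN check from the script.

```shell
#!/bin/bash
# wait_until TRIES DELAY COMMAND...
# Runs COMMAND up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until() {
    local tries="$1"
    local delay="$2"
    local i=0
    shift 2
    while [ "$i" -lt "$tries" ]; do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Hypothetical helper: succeeds when something is listening on port $1.
port_listening() {
    netstat -an 2>/dev/null | grep -w "$1" | grep -q LISTEN
}

# Example (commented out because it needs a running server):
# wait_until 10 1 port_listening 29092 && echo "port is up"
```

Compared with a fixed sleep 2, the loop returns as soon as the port is open and still tolerates a slow start up to TRIES times DELAY seconds.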
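At the developers' request, the script also saves the ChatterServer console log under a per-date file name, so that the existing log is not overwritten by a new one when the service changes state, and it writes the script's own actions to a separate log file for administrators to review.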
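The logging scheme can be illustrated in isolation. The sketch below uses the same file name pattern and timestamp format as the script, but LOGDIR is a temporary directory and log_event is a hypothetical helper, both introduced only for this example.

```shell
#!/bin/bash
# Assumption: a throwaway directory stands in for $BASEDIR/logs.
LOGDIR=$(mktemp -d)

# One console log per day; ">>" appends, so a restart on the same day
# adds to the file instead of overwriting it.
console_log="$LOGDIR/console-$(date +%Y%m%d).out"

# Hypothetical helper: append a timestamped line to the service log.
log_event() {
    echo "[ $(date +'%D %T') ] $*" >>"$LOGDIR/service.log"
}

echo "server output" >>"$console_log"
echo "more output after a restart" >>"$console_log"
log_event "SUCCESS: ChatterServer started"

cat "$LOGDIR/service.log"
```

Because the date is part of the file name, the first start after midnight begins a new console log automatically.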
The fourth version of the script follows:
#!/bin/bash
#chkconfig: 345 86 14
#description: Startup and shutdown script for ChatterServer

VERSION=1.0.0-snapshot
BASEDIR=/data/chatterserver
LOGDIR=$BASEDIR/logs
SERVICEPORT=29092
PIDFILE=$BASEDIR/pid/chatter.pid
SERVER=$BASEDIR/chatter-$VERSION.jar
BASENAME=chatter
# -Xms2g -Xmx2g -Xmn2g -Xss128k -XX:MaxPermSize=64m -XX:-UseParallelGC -XX:+UseParallelOldGC -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=30 -XX:SurvivorRatio=6
ARGS=""

status() {
    # The judgment priority: pid > port > pidfile
    # netstat run by a common user prints some errors, so send that error output to /dev/null
    if [[ $(netstat -anop 2>/dev/null | grep $SERVICEPORT | grep LISTEN) || -f $PIDFILE ]]; then
        #pid=$(cat $PIDFILE)
        pid=$(ps -ef | grep java | grep $BASENAME | grep -v grep | awk '{print $2}')
        if [[ $pid != "" && $(ps -ef | grep $pid | grep -v grep) ]]; then
            echo "SUCCESS: ChatterServer is OK"
            exit 0
        else
            echo "ERROR: ChatterServer pid does NOT exist"
            exit 1
        fi
    elif [[ ! $(netstat -anop 2>/dev/null | grep $SERVICEPORT | grep LISTEN) ]]; then
        echo "ERROR: ChatterServer port is NOT listening"
        exit 1
    elif [[ ! -f $PIDFILE ]]; then
        echo "ERROR: ChatterServer pid file does NOT exist"
        exit 1
    else
        echo "ERROR: ChatterServer is NOT running"
        exit 1
    fi
}

start() {
    if [[ -e $PIDFILE ]]; then
        echo "ERROR: pidfile $PIDFILE exists, server has started with pid $(cat $PIDFILE)"
        # The pid file can be deleted so that the next start attempt can proceed
        /bin/rm -f $PIDFILE
        exit 1
    fi
    if [[ -e $SERVER ]]; then
        echo "INFO: Starting ChatterServer"
        # Start ChatterServer core daemon
        # Why use "date +"%Y%m%d""? Because we only need to restart this once per day
        # ChatterServer looks for some files in $BASEDIR, so change into it first
        cd $BASEDIR
        #nohup java -jar $SERVER $ARGS >>$LOGDIR/console-$(date +"%Y%m%d").out 2>&1 &
        java -jar $SERVER $ARGS >>$LOGDIR/console-$(date +"%Y%m%d").out 2>&1 &
        #java -jar $SERVER $ARGS >$LOGDIR/console.out 2>&1 &
        # Note: for a background command $? only says whether it was launched
        RETVAL=$?
        # The shell does NOT need the home directory
        ## ChatterServer looks for some files in $BASEDIR
        #cd
        if [[ $RETVAL -eq 0 ]]; then
            ## $! --> Expands to the process ID of the most recently executed background (asynchronous) command.
            #echo $! > $PIDFILE
            # Because of java performance, port 29092 starts listening later, so wait 2 seconds
            sleep 2
            # get pid var
            # TODO remove debug info
            #echo "DEBUG: "
            #ps -ef | grep $BASENAME | grep -v grep | awk '{print $2}'
            # end debug
            pid=$(ps -ef | grep java | grep $BASENAME | grep -v grep | awk '{print $2}')
            # send pid number to pid file
            echo $pid > $PIDFILE
            # These lines will be removed in the next release
            # TODO remove debug info
            #echo "DEBUG: live 1"
            # Because of java performance, port 29092 starts listening later, so the condition also accepts the pid file
            if [[ $(netstat -anop 2>/dev/null | grep $SERVICEPORT | grep LISTEN) || -f $PIDFILE ]]; then
                echo "SUCCESS: ChatterServer start OK"
                # Setting up start log
                echo "[ $(date +"%D %T") ] SUCCESS: ChatterServer started with pid $(cat $PIDFILE) " >>$LOGDIR/service.log
            fi
            # TODO remove debug info
            #echo "DEBUG: live 2"
            # -These lines will be removed in the next release
            #echo "SUCCESS: ChatterServer start OK"
            ## Setting up start log
            #echo "[ $(date +"%D %T") ] SUCCESS: ChatterServer started with pid $(cat $PIDFILE) " >>$LOGDIR/service.log
        else
            echo "ERROR: ChatterServer start failed"
            # Setting up start log
            echo "[ $(date +"%D %T") ] ERROR: ChatterServer start failed " >>$LOGDIR/service.log
            exit $RETVAL
        fi
    else
        echo "ERROR: Couldn't find $SERVER"
        # TODO We just think this is not essential
        # Do NOT set up a log here
        exit 1
    fi
}

stop() {
    if [[ -e $PIDFILE ]]; then
        pid=$(cat $PIDFILE)
        #if kill -TERM $PIDFILE >/dev/null 2>&1
        # TODO remove debug info
        #echo "DEBUG: $LOGDIR/console-$(date +"%Y%m%d").out"
        # Ubuntu can NOT use "usleep", so use "sleep" instead
        # usleep 100000
        if kill -TERM $pid >>$LOGDIR/console-$(date +"%Y%m%d").out 2>&1 && sleep 1
        then
            echo "SUCCESS: ChatterServer stop OK with TERM"
            # Setting up stop log
            echo "[ $(date +"%D %T") ] SUCCESS: ChatterServer stop OK with TERM " >>$LOGDIR/service.log
            # Because we can NOT use usleep, the extra sleep 1 below is commented out
            #sleep 1
        # Ubuntu can NOT use "usleep", so use "sleep" instead
        # usleep 100000
        elif kill -KILL $pid >/dev/null 2>&1 && sleep 1
        then
            echo "SUCCESS: ChatterServer stop OK with KILL"
            # Setting up stop log
            echo "[ $(date +"%D %T") ] SUCCESS: ChatterServer stop OK with KILL " >>$LOGDIR/service.log
            # Because we can NOT use usleep, the extra sleep 1 below is commented out
            #sleep 1
        else
            echo "ERROR: ChatterServer stop failed"
            # Setting up stop log
            echo "[ $(date +"%D %T") ] ERROR: ChatterServer stop failed " >>$LOGDIR/service.log
            exit 1
        fi
        # Remove pid file
        if [[ -f $PIDFILE ]]; then
            /bin/rm -f $PIDFILE
        fi
    else
        echo "ERROR: No ChatterServer running"
        # TODO We just think this is not essential
        # Do NOT set up a log here
        exit 1
    fi
}

restart() {
    echo "INFO: Restarting ChatterServer"
    stop
    # These lines will be removed in the next release
    if [[ $(netstat -anop 2>/dev/null | grep $SERVICEPORT | grep LISTEN) ]]; then
        echo "WARNING: port $SERVICEPORT is in use, must wait"
        sleep 5
        if [[ $(netstat -anop 2>/dev/null | grep $SERVICEPORT | grep LISTEN) ]]; then
            echo "WARNING: port $SERVICEPORT is still in use, must wait"
            sleep 2
        fi
    fi
    # -These lines will be removed in the next release
    # Do NOT sleep here, because the stop() function already waits
    start
}

case $1 in
    status)
        status
        ;;
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        restart
        ;;
    help|*)
        echo "Usage: $0 {status|start|stop|restart|help} with $0 itself"
        echo "Usage: service chatter {status|start|stop|restart|help} with service"
        exit 1
        ;;
esac
# replace "exit 0" with ":"
#exit 0
:
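In stop(), the TERM branch reports success as soon as the signal has been sent, so the KILL branch only runs when kill -TERM itself fails. A possible refinement, shown here only as a sketch and not as part of the script above, is to check whether the process is really gone before escalating. stop_pid is a hypothetical helper; kill -0 sends no signal and only tests whether the pid can be signalled.

```shell
#!/bin/bash
# stop_pid PID [TIMEOUT]
# Sends TERM, waits up to TIMEOUT seconds for the process to exit,
# then falls back to KILL. Returns 0 if the process is gone at the end.
stop_pid() {
    local pid="$1"
    local timeout="${2:-5}"
    local i=0
    # Fails when the pid does not exist or may not be signalled
    kill -TERM "$pid" 2>/dev/null || return 1
    while [ "$i" -lt "$timeout" ]; do
        # kill -0 fails once the process has exited
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    # Still alive after TIMEOUT seconds: force it
    kill -KILL "$pid" 2>/dev/null
    sleep 1
    ! kill -0 "$pid" 2>/dev/null
}

# Example with a throwaway background process standing in for the server:
sleep 30 &
stop_pid "$!" 3 && echo "stopped"
```

Inside stop(), the pid read from $PIDFILE would be passed in, and the return value would decide which message goes to service.log.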