Auto restart by log event

From Wiki Clusterlab.com.br
Revision as of 16:41, 28 December 2022 by Damato

If a matching error appears in "/var/log/messages" within the last 5 minutes (300 seconds), the routine restarts the service.

# ./myscript.sh <number of seconds>
$ ./myscript.sh 300

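The script converts the date and time of each matching log entry (the first three fields of the line) to epoch seconds, takes the current epoch time, and subtracts the given number of seconds to get a cutoff. If any matching entry is newer than the cutoff, the service is restarted.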
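
The conversion described above can be tried in isolation. The following is a minimal sketch, assuming GNU `date` (for the `-d` option) and traditional syslog timestamps such as "Dec 28 16:41:00". The function names `to_epoch` and `is_recent` are hypothetical and are not part of the script on this page.

```shell
#!/bin/bash
# Hypothetical helper: convert a timestamp string to epoch seconds (GNU date).
to_epoch() {
    date -d "$1" +"%s"
}

# Hypothetical helper: succeed if the timestamp in $1 is newer than
# (now - $2 seconds), fail otherwise.
is_recent() {
    local ts now
    ts=$(to_epoch "$1") || return 2   # unparsable timestamp
    now=$(date +"%s")
    [ "$ts" -gt $((now - $2)) ]
}

# Example: the current time is always inside a 300-second window.
# LC_ALL=C forces English month names, which is what date -d expects.
if is_recent "$(LC_ALL=C date '+%b %e %H:%M:%S')" 300
then
    echo "recent"
fi
```

Because syslog timestamps carry no year, `date -d` assumes the current year when parsing them.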

#!/bin/bash

# Count the matching log entries newer than the epoch passed as $1.
GET_COUNT() {
    grep -E "ERROR - my specific message of fail 1|ERROR - my specific message of fail 2" /var/log/messages | \
        awk '{ print $1 " " $2 " " $3 }' | \
        while read -r line
        do
            # Convert the syslog timestamp (month, day, time) to epoch seconds
            date -d "$line" +"%s"
        done | \
        while read -r EPOCH
        do
            if [ "$EPOCH" -gt "$1" ]
            then
                echo "$EPOCH"
            fi
        done | wc -l | awk '{ print $1 }'
}
RESTART_APP() {
    date
    systemctl restart MYAPP.service
    echo service restarted
}
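# Look-back window in seconds, passed as the first argument
if [ -z "$1" ]
then
    echo "Usage: $0 <number of seconds>" >&2
    exit 1
fi
WINDOW=$1
NOW=$(date +"%s")
SINCE=$((NOW - WINDOW))

# Restart only if at least one matching error is newer than SINCE
if [ "$(GET_COUNT "$SINCE")" -gt 0 ]
then
    RESTART_APP
fi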
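
To check the counting logic without reading "/var/log/messages" or restarting anything, a variation like the one below can be pointed at a sample file. This is only a sketch under the same assumptions (GNU `date`, traditional syslog timestamps); the function name `count_recent_errors` and its log-file argument are hypothetical. It also ignores timestamps that parse to a future time, which can happen in January when year-less entries from December are interpreted as belonging to the current year.

```shell
#!/bin/bash
# Hypothetical helper: count error lines in a log file whose timestamp falls
# inside the look-back window. Nothing is restarted.
#   $1 = log file, $2 = window in seconds
count_recent_errors() {
    local log_file=$1 window=$2 now since count=0 stamp epoch
    now=$(date +"%s")
    since=$((now - window))
    while read -r stamp
    do
        # Skip lines whose first three fields are not a parsable timestamp
        epoch=$(date -d "$stamp" +"%s" 2>/dev/null) || continue
        # Keep entries newer than the cutoff and not in the future
        if [ "$epoch" -gt "$since" ] && [ "$epoch" -le "$now" ]
        then
            count=$((count + 1))
        fi
    done < <(grep -E "ERROR - my specific message of fail 1|ERROR - my specific message of fail 2" "$log_file" | \
        awk '{ print $1 " " $2 " " $3 }')
    echo "$count"
}

# Usage (prints the number of matching entries in the last 300 seconds):
#   count_recent_errors /path/to/sample.log 300
```

Taking the log file as a parameter makes it possible to exercise the function with a handcrafted file before using it against the real log.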