I have two SQL databases running on my Linux machine. Is there a way I can monitor if data replication was successful?
How can I monitor MySQL replication on a Linux machine?
Votes:
0
4 Replies
Votes:
1
This article applies to PRTG Network Monitor 12 or later
Monitoring MySQL Data Replication
This can be achieved using a custom script on your Linux server. The script will check if the replication was successful, then returning the respective result on an http web page.
Please make sure this script can be run by calling a certain URL. It should return a web page showing numeric status results in the format, for example, like this:
[3][0][0][0]
Once this URL works, use it in combination with PRTG's HTTP Content Sensor, in order to integrate the replication check in your existing monitoring. With each sensor scan the script the URL will be called, thus the script will run, check for successful replication, and return the simple web page shown above.
Example Script: Python
In the following we provide a sample Python script that comes with absolutely no warranty! We provide it for your information to illustrate the general concept. You will have to adapt the script to fit your own needs.
# -*- coding: utf-8 -*- """ paessler.utils.system.service ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ :Authors: Michael Zeidler :Organization: Paessler AG :Copyright: 2012 Paessler AG """ # Imports # ------- import re import shlex import logging import subprocess from django.db import connections from itertools import izip # Constants # --------- logger = logging.getLogger('paessler') LIMIT_OUTOF_SYNC_SECONDS = 60*5 # as the monitoring check is executed every 5 minutes MYSQL_SHOW_SLAVE_STATUS = 'SHOW SLAVE STATUS' MYSQL_SHOW_MASTER_STATUS = 'SHOW MASTER STATUS' def get_mysql_master_replication_status(): cursor = connections['replication_master'].cursor() cursor.execute(MYSQL_SHOW_MASTER_STATUS) row = cursor.fetchone() col_names = [desc[0] for desc in cursor.description] return dict(izip(col_names, row)) def get_mysql_slave_replication_status(): cursor = connections['replication_slave'].cursor() cursor.execute(MYSQL_SHOW_SLAVE_STATUS) row = cursor.fetchone() col_names = [desc[0] for desc in cursor.description] return dict(izip(col_names, row)) class MySQLDatabaseReplicationValidation(object): def validate(self): """Validates the slave, master and slave-master""" errors = {} master_res, master_errors = self.validate_master() slave_res, slave_errors = self.validate_slave() errors.update(master_errors) errors.update(slave_errors) if master_res and slave_res: master_position = master_res['Position'] master_binlog_file = master_res['File'] slave_master_log_file = slave_res['Master_Log_File'] slave_read_master_log_position = slave_res['Read_Master_Log_Pos'] if master_binlog_file != slave_master_log_file: errors.update({'MASTER_SLAVE_ERROR': "master_log_file '%s' != slave_log_file: '%s'" % (master_binlog_file, slave_master_log_file)}) return errors def validate_master(self): errors = {} try: res = get_mysql_master_replication_status() except Exception, e: errors = {'MASTER_CONNECTION_ERROR': str(e)} return None, errors position = res['Position'] binlog_do_db = res['Binlog_Do_DB'] binlog_file = res['File'] binlog_ignore_db = res['Binlog_Ignore_DB'] return res, errors def validate_slave(self): errors = {} try: res = get_mysql_slave_replication_status() except Exception, e: errors = {'SLAVE_CONNECTION_ERROR': str(e)} return None, errors slave_io_state = res['Slave_IO_State'] slave_io_running = res['Slave_IO_Running'] slave_sql_running = res['Slave_SQL_Running'] master_log_file = res['Master_Log_File'] read_master_log_position = res['Read_Master_Log_Pos'] last_errno = res['Last_Errno'] last_error = res['Last_Error'] last_io_errno = res['Last_IO_Errno'] last_io_error = res['Last_IO_Error'] last_sql_errno = res['Last_SQL_Errno'] last_sql_error = res['Last_SQL_Error'] seconds_behind_master = res['Seconds_Behind_Master'] if seconds_behind_master > LIMIT_OUTOF_SYNC_SECONDS: errors.update({'SLAVE_OUT_OF_SYNC_LIMIT_EXCEEDED':seconds_behind_master}) if slave_io_running != 'Yes': errors.update({'SLAVE_IO_RUNNING': False}) if slave_sql_running != 'Yes': errors.update({'SLAVE_IO_RUNNING': False}) if last_errno > 0: errors.update({'SLAVE_ERROR': "%d : %s" % (last_errno, last_error)}) return res, errors
Created on Aug 15, 2012 2:35:00 PM by
Daniel Zobel [Product Manager]
Last change on Sep 5, 2012 9:12:17 AM by
Michael Zeidler [Paessler Support]
(112)
●3
●1
Votes:
1
It's possible to use a MySQL sensor to do a query. Use the SQL command "select variable_value from information_schema.global_status where variable_name='slave_running';". It's returns "ON" or "OFF", where "ON" means both the SQL thread and the IO thread are running, and "OFF" means either the SQL thread or the IO thread is not running. http://blog.webyog.com/2012/11/20/how-to-monitor-mysql-replication
Votes:
0
Here is another possibility written in perl that is a little simpler
#!/usr/bin/perl -w use strict; use DBI; # Script to test MySQL replication by TRD 08/01/2008 print "Content-type: text/html\n\n"; #Variables my $dbServer="DBI:mysql:database=dbname;host=localhost"; my $dbUser="dbuser"; my $dbPwd="dbpass"; #Connect to DB my $db=DBI->connect($dbServer,$dbUser,$dbPwd) || exit 2; my $rs=$db->prepare('show slave status'); $rs->execute(); my $dr=$rs->fetchrow_hashref(); my $Slave_IO_Running=$dr->{'Slave_IO_Running'}; my $Slave_SQL_Running=$dr->{'Slave_SQL_Running'}; $rs->finish(); $db->disconnect(); if (($Slave_IO_Running eq 'Yes') && ($Slave_SQL_Running eq 'Yes')) { print "[1]"; exit 0; } else { print "[0]"; exit 0; }
This will return [1] if the replication is running and [0] if replication is broken and you can use an advanced http sensor to monitor the content of the page.
Created on Dec 15, 2014 9:09:57 PM
Last change on Dec 16, 2014 7:58:33 AM by
Konstantin Wolff [Paessler Support]
Votes:
0
I created a bash script to deal with this question with some further features starting with a gist from ssimpson89 on github. You can find it with my adaptations here: https://gist.github.com/flickerfly/c64888f3c4de28366869ec6010f94f96d - This also will email you if you want it to, but you have PRTG so why would you worry about that? :-)
For ease of access, here is the current revision. The above link may have an improved version from the future.
### VARIABLES ### \ SERVER=$(hostname) MYSQL_CHECK=$(mysql -e "SHOW VARIABLES LIKE '%version%';" || echo 1) SLAVE_STATUS=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G"|grep -v row) LAST_ERRNO=$(echo "${SLAVE_STATUS}" | grep "Last_Errno:" | awk '{ print $2 }' ) SECONDS_BEHIND_MASTER=$(echo "${SLAVE_STATUS}" | grep "Seconds_Behind_Master:" | awk '{ print $2 }' ) IO_IS_RUNNING=$(echo "${SLAVE_STATUS}" | grep "Slave_IO_Running:" | awk '{ print $2 }' ) SQL_IS_RUNNING=$(echo "${SLAVE_STATUS}" | grep "Slave_SQL_Running:" | awk '{ print $2 }' ) SLAVE_IO_STATE=$(echo "${SLAVE_STATUS}" | grep "Slave_IO_State:" | awk -F':' '{gsub(/^[ \t]+/,"",$2);gsub(/[ \t]+$/,"",$2); print $2 }' ) ERRORS=() #echo "${SLAVE_STATUS}" ### Run Some Checks ### ## Check if I can connect to Mysql ## if [ "$MYSQL_CHECK" == 1 ] then ERRORS=("${ERRORS[@]}" "Can't connect to MySQL (Check Pass)") fi ## Check For Last Error ## if [ "$LAST_ERRNO" != 0 ] then ERRORS=("${ERRORS[@]}" "Error when processing relay log (Last_Errno = ${LAST_ERRNO})") fi ## Check if IO thread is running ## if [ "$IO_IS_RUNNING" != "Yes" ] then ERRORS=("${ERRORS[@]}" "I/O thread for reading the master's binary log is not running (Slave_IO_Running)") fi ## Check for SQL thread ## if [ "$SQL_IS_RUNNING" != "Yes" ] then ERRORS=("${ERRORS[@]}" "SQL thread for executing events in the relay log is not running (Slave_SQL_Running)") fi ## Check how slow the slave is ## if [ "$SECONDS_BEHIND_MASTER" == "NULL" ] then ERRORS=("${ERRORS[@]}" "The Slave is reporting 'NULL' (Seconds_Behind_Master)") elif [ "$SECONDS_BEHIND_MASTER" -gt 60 ] then ERRORS=("${ERRORS[@]}" "The Slave is at least 60 seconds behind the master (Seconds_Behind_Master)") fi ### Send and Email if there is an error ### if [ "${#ERRORS[@]}" -gt 0 ] then MESSAGE_PRTG="$(for i in $(seq 1 ${#ERRORS[@]}) ; do echo "\t${ERRORS[$i]}\n" ; done)" MESSAGE_EMAIL="An error has been detected on ${SERVER} involving the mysql replciation. Below is a list of the reported errors:\n\n ${MESSAGE_PRTG} Please correct this ASAP" #echo -e $MESSAGE_EMAIL #| mail -s "Mysql Replication for $SERVER is reporting Error" ${EMAIL} echo "2:${SECONDS_BEHIND_MASTER}:${MESSAGE_PRTG}" else echo "0:${SECONDS_BEHIND_MASTER}:${SLAVE_IO_STATE}" fi
Created on Sep 14, 2016 11:09:29 PM
Last change on Sep 15, 2016 5:04:55 AM by
Luciano Lingnau [Paessler]
Add comment