PRTG cluster - Fail over


The other day I stopped the PRTG services on the Primary node of my PRTG cluster. The Secondary node became active and started doing the alerting, nice and smoothly.

When I started the PRTG services on the Primary node again, PRTG immediately failed back to this node. However, it took the node over 30 minutes to check all the sensors, which caused a lot of Business Service sensors to go red.

Is there a way to disable the automatic fail-back of the cluster? If I could fail back manually, I could do so once the Primary node has checked all sensors again.

I know I can manually fail over to the Secondary node if I need to do maintenance on the Primary node. But I'm talking about the Primary node, for example, crashing (BSOD) and restarting automatically. I really don't want an automatic fail-back to cause an alert storm because the Primary node hasn't checked all sensors yet...

Kind regards,

Corné van den Bosch

cluster fail-over prtg

Created on Mar 17, 2017 10:09:45 AM



2 Replies


Hello and thank you for your KB-Post,

Could it be that this is a fairly large deployment? Please see the performance constraints regarding clusters:

Within PRTG there's no way of controlling this. The node with the highest priority will automatically "re-take control of the cluster" as soon as it starts.

As a workaround, you could configure the PRTG Core Server Service to start manually only. The probe service will then still resume automatically, which means that the failover node will start getting data from both nodes again, but the core server service (and with it the "primary cluster node") will only come back when you start the service manually.

This also means that when the core server starts again, the probe will have been running for a while and should already have polled all sensors at least once, so the core won't have to wait for all sensors to slowly resume and start delivering data again.
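As a rough illustration of the manual-start workaround, here is a minimal Python sketch that builds the matching `sc.exe` calls. It assumes the Windows service name of the PRTG Core Server Service is `PRTGCoreService`; please verify the name in services.msc on your node first. The helper names `manual_start_command` and `start_command` are made up for this example.

```python
import subprocess
import sys

# Assumption: "PRTGCoreService" is the Windows service name of the
# PRTG Core Server Service. Check services.msc before relying on it.
CORE_SERVICE = "PRTGCoreService"


def manual_start_command(service=CORE_SERVICE):
    """Build the sc.exe call that sets a service to manual start."""
    # sc.exe expects "start=" and its value as two separate arguments.
    return ["sc", "config", service, "start=", "demand"]


def start_command(service=CORE_SERVICE):
    """Build the sc.exe call that starts the service on demand."""
    return ["sc", "start", service]


if __name__ == "__main__" and sys.platform == "win32":
    # Needs an elevated prompt on the cluster node.
    subprocess.run(manual_start_command(), check=True)
```

Once the probe has had time to poll all sensors, passing `start_command()` to `subprocess.run` (or simply starting the service from the Services console) brings the core server back.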

You could also use a Windows Scheduled Task to postpone the start of the Core Server Service until a specific time after the system starts.
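A minimal Python sketch of the scheduled-task idea could look like this. It only builds a `schtasks` command line for a task that runs at system start with a delay and then starts the service. The service name `PRTGCoreService`, the task name, and the helper `delayed_start_task` are assumptions for illustration, and the sketch presumes the service itself is not set to start automatically.

```python
def delayed_start_task(delay_minutes,
                       service="PRTGCoreService",        # assumed service name
                       task_name="Delayed PRTG Core Start"):  # hypothetical task name
    """Build a schtasks call that starts a service some minutes after boot."""
    if not 0 <= delay_minutes <= 9999:
        raise ValueError("delay_minutes must be between 0 and 9999")
    # schtasks expects the /DELAY value in mmmm:ss format.
    delay = f"{delay_minutes:04d}:00"
    return [
        "schtasks", "/Create",
        "/TN", task_name,
        "/TR", f"sc start {service}",
        "/SC", "ONSTART",      # trigger at system start
        "/DELAY", delay,       # wait before running the action
        "/RU", "SYSTEM",       # run under the local SYSTEM account
    ]


if __name__ == "__main__":
    # Print the arguments for review instead of creating the task directly.
    print(delayed_start_task(30))
```

The delay should be chosen so that the probe has polled all sensors at least once; the 30 minutes used above is only a placeholder taken from the scan time mentioned in the question.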

Best Regards,
Luciano Lingnau [Paessler Support]

Created on Mar 20, 2017 12:41:48 PM by  Luciano Lingnau [Paessler]




Hello Luciano,

Not that big. Just a two-node PRTG cluster with not even 5k sensors.

The trick of setting the PRTG Core Server Service to start manually is a good one! I'll configure it on both nodes right away.

Especially since we're working on automatically creating tickets in our ticket system as soon as a sensor goes red, I can't have this sea of red sensors when the Primary Master reboots (it happens; maintenance and such). The Business Service sensors in particular are quite sensitive to this...

But this trick can help me avoid this. Thank you for the tip!

Kind regards,

Corné van den Bosch

Created on Mar 21, 2017 8:02:12 AM




Disclaimer: The information in the Paessler Knowledge Base comes without warranty of any kind. Use at your own risk. Before applying any instructions please exercise proper system administrator housekeeping. You must make sure that a proper backup of all your data is available.