Sensors in Windows failover cluster environment
Hi, there seems to be a problem with sensors in Windows failover cluster environments. I configured some sensors (services, harddisk usage, ...) against the cluster IP address (of the specific clustered services, NOT the physical node!). When I perform a failover to the 2nd node, all sensors show failures because they can no longer find the specified drives and services. Any ideas?
Votes:
0
Best Answer
Votes:
0
The logic and behavior of WMI sensors changed significantly in PRTG version 17.1.28 and later:
Essentially, WMI connections (which are DCOM-based) are kept alive for as long as possible, preferably forever. A new connection (which will then re-resolve the virtual IP) is only established if the existing connection breaks.
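For illustration only, here is a minimal sketch of that keep-alive pattern in Python (not PRTG's actual implementation; the host name and port are hypothetical): reuse one connection for as long as it works, and only re-resolve the name and reconnect once it breaks, which is what picks up the new node after a failover.

```python
import socket

HOST = "sqlcluster.example.local"  # hypothetical virtual cluster host name
PORT = 1433                        # hypothetical service port

_conn = None  # cached connection, kept alive "preferably forever"

def get_connection():
    """Return the cached connection; resolve and connect only if needed."""
    global _conn
    if _conn is None:
        # Re-resolving here is what picks up the new node after a failover.
        addr = socket.gethostbyname(HOST)
        _conn = socket.create_connection((addr, PORT), timeout=10)
    return _conn

def scan():
    """One sensor scan: use the cached connection, reconnect once on failure."""
    global _conn
    try:
        get_connection().sendall(b"\x00")  # placeholder probe
    except OSError:
        _conn = None      # the connection broke (e.g. after a failover) ...
        get_connection()  # ... so resolve the host again and reconnect
```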
Created on Mar 28, 2017 8:38:08 AM by
Luciano Lingnau [Paessler]
Last change on Mar 28, 2017 8:39:30 AM by
Luciano Lingnau [Paessler]
10 Replies
Votes:
0
Hello,
I'm afraid that can't be changed, due to the way the inner workings of WMI are implemented. The reason is that when the cluster fails over, the IP stays the same but the MAC address changes, and unfortunately WMI connections are tied to the MAC address. PRTG establishes a new connection every 60 minutes, and this is probably the time you had to wait. This interval will most likely not be changed, because establishing a WMI connection is unfortunately very slow (over 20 s), so we can't do it before every sensor scan.
We recommend monitoring the hardware with "hardware-near" sensors directly on the physical nodes (not via the cluster IP), and then using service sensors (HTTP, SMB sensors, etc.) to check the clustered service (a sketch follows below).
Best regards.
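As an illustration of the service-level approach suggested above, here is a minimal Python sketch (the cluster host name and ports are hypothetical): it probes the clustered services over the network via the virtual host name, independent of which physical node currently runs them.

```python
import socket

CLUSTER_HOST = "sqlcluster.example.local"    # hypothetical virtual host name
SERVICE_PORTS = {"mssql": 1433, "smb": 445}  # services expected on the cluster

def service_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the service succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, port in SERVICE_PORTS.items():
    state = "up" if service_reachable(CLUSTER_HOST, port) else "DOWN"
    print(f"{name} on {CLUSTER_HOST}:{port} is {state}")
```

A check like this keeps working across a failover because it only depends on the virtual address answering, not on a node-bound WMI/DCOM connection.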
Votes:
0
Hello Torsten, unfortunately I can't do that. I cannot monitor the hardware itself, because there are a couple of SQL Server instances with their associated Windows services running. Each SQL Server instance has its own IP address and its own hostname, of course. So when the mentioned services are stopped on the first node and started on the second, I have to monitor these processes on the second node, and I can do this only by monitoring the IP address of the clustered services.
If I used sensors to monitor these services on the machine itself, I would get failures on the machine that is currently NOT hosting those services.
Votes:
0
@Torsten:
FYI: I just performed another failover and see that all services are monitored correctly, but my harddisk usage sensor is still in an error state.
Votes:
0
The suggestion was to monitor the services, like SQL, via the cluster IP with SQL sensors (not WMI SQL sensors, because they would face the same issue), and to run all hardware sensors, like harddisk usage, on the physical nodes. With WMI, there is no other way, sorry. But that is not PRTG's fault.
Votes:
0
I cannot cover the harddisk usage on the physical disk, because that disk is no longer part of that node...
Votes:
0
As you wrote, PRTG establishes a new connection every 60 minutes. I waited almost 120 minutes, but the status of that sensor is still "erroneous".
Any ideas? Thank you!
Votes:
0
The funny thing is: when I replace the configured hostname with the cluster IP address, all service states show errors again. Putting the hostname back, everything is green except the harddisk sensor...
Votes:
0
OK, I got it. As mentioned in PRTG, I had to use the drive letter and not the ID. Everything is working fine now :-)
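For anyone hitting the same issue: a minimal sketch of what "query by drive letter" looks like at the WMI level, using the third-party Python "wmi" package (Windows only; host name, credentials, and drive letter are hypothetical). Win32_LogicalDisk is keyed by DeviceID (the drive letter), so the query follows the clustered drive to whichever node currently owns it.

```python
import wmi  # third-party package: pip install wmi (requires pywin32)

# Hypothetical cluster host name, credentials, and clustered drive letter:
c = wmi.WMI(computer="sqlcluster.example.local",
            user=r"EXAMPLE\monitor", password="hypothetical-password")

for disk in c.query(
        "SELECT DeviceID, FreeSpace, Size FROM Win32_LogicalDisk "
        "WHERE DeviceID = 'S:'"):
    free_gb = int(disk.FreeSpace) / 1024**3  # WMI returns uint64 as a string
    size_gb = int(disk.Size) / 1024**3
    print(f"{disk.DeviceID} {free_gb:.1f} GB free of {size_gb:.1f} GB")
```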
Votes:
0
Changing the Hostname/IP actually also creates a new WMI Connection.