I have a setup with 12-13 devices that have several custom EXE/XML sensors each.
The polling interval is 60 seconds, with a timeout of 30 seconds on the scripts. The scripts usually take 0.5 to 2 seconds to execute, depending on their complexity. The scripts exit early if the host is not reachable (I have programmed a ping check at the start of each script to avoid waiting for long timeouts if the host is down).
The number of custom sensors on each device varies from 1 to 6.
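To illustrate the ping-first pattern described above, here is a minimal sketch in Python. The actual scripts are batch files, so this is not their real code. The names `host_reachable`, `prtg_result`, `prtg_error` and the channel name `Reachable` are hypothetical; the `ping -n 1 -w <ms>` flags assume the Windows `ping` command. The sketch closes stdin and enforces a hard timeout on the child process, so that the check cannot wait forever on input or on a hung ping.

```python
import subprocess
import sys
from xml.sax.saxutils import escape


def host_reachable(host, timeout_s=2.0, ping_cmd=None):
    """Return True if a single ping succeeds within the time limit.

    ping_cmd lets a caller substitute another command (useful for testing);
    by default the Windows ping syntax is assumed.
    """
    cmd = ping_cmd or ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    try:
        completed = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,   # never wait for interactive input
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,      # hard limit; the child is killed on expiry
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return completed.returncode == 0


def prtg_result(channel, value):
    """Build a one-channel result in the EXE/XML sensor output format."""
    return (
        "<prtg><result><channel>%s</channel><value>%s</value></result></prtg>"
        % (escape(str(channel)), escape(str(value)))
    )


def prtg_error(text):
    """Build an error response in the EXE/XML sensor output format."""
    return "<prtg><error>1</error><text>%s</text></prtg>" % escape(str(text))


def main(host):
    # Exit early when the host is down instead of running the real checks.
    if not host_reachable(host):
        print(prtg_error("Host %s is not reachable" % host))
        return
    # Placeholder for the real measurement (hypothetical channel name).
    print(prtg_result("Reachable", 1))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Invoked as `python sensor.py <host>` (file name hypothetical), this prints either an error element or a single result element and then exits.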
I've given each device its own mutex name to avoid too many scripts being run at the same time on each individual device.
Everything had been working flawlessly for several months, but for the last couple of weeks I have kept getting cascading mutex errors. It gets worse and worse until all custom scripts on all devices are permanently stuck. Restarting the PRTG Core Server and PRTG Probe Service does not help. The only way to fix it is to restart the system, but it gets just as bad again within a few hours.
I've tried removing all the mutex names and letting all the scripts run at the same time. That doesn't help: a lot of the scripts still get stuck, and they produce no data (grey icon).
For troubleshooting, I've also tried giving all the devices' custom sensors the same mutex name, and this does not help either. Mutex timeouts will start to happen within a day.
I really cannot figure out what is happening. Have I misunderstood the whole concept of mutexes? Isn't the whole purpose of mutex names to avoid having too many scripts running at the same time? And yet it does not seem to prevent this problem.
Looking at the Task Manager, I can see that all the bat scripts are being started, but they never finish or terminate. These scripts always work as they should when I run them in a cmd shell. They even have a failsafe to terminate if the host is not reachable. What is PRTG suddenly doing differently compared to running them manually in a shell?
Do you have any tips for troubleshooting this?