This article applies as of PRTG 22
Mutex timeouts (code: PE035): Workaround
PRTG calculates the point in time of a sensor scan with a specific algorithm. To avoid overloading a device, all requests to one device are distributed as much as possible.
However, if many sensors are spread across many different devices, for example, one dedicated device for each folder in an OWA mailbox, the algorithm schedules all requests for the same second. Sensors that share a mutex then all request it at the same moment, which can lead to mutex timeouts.
A workaround for this issue is to cheat the algorithm a little bit.
- The scanning distribution always begins with the first sensor on a device.
- When you add other sensors to a device before certain mutex sensors, the mutex sensors are scanned later.
- For example, create some Green IT sensors as dummies. This sensor type does nothing but show the Up status. The dummy sensors must be active, not paused.
- The maximum scanning difference between two sensors on one device is five seconds. Thus, the scan of each subsequent sensor is postponed by five seconds per dummy sensor (provided that only very few sensors are on the same device).
If you apply this tweak to half of your EXE sensors, for example, it might already help enough because these sensors are scanned five seconds after the other sensors with the same mutex.
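The effect of the dummy sensors can be illustrated with a small Python sketch. This is a simplified model and not the actual PRTG scheduling algorithm: it assumes that every sensor on a device starts exactly five seconds after the one before it, which only holds if very few sensors are on the device. The function name `scan_offsets` and the sensor names are hypothetical.

```python
# Simplified model (assumption): each sensor on a device starts a fixed
# number of seconds after the previous one. The real algorithm is not public.
MAX_GAP_SECONDS = 5  # maximum scanning difference between two sensors on one device


def scan_offsets(sensor_names, gap_seconds=MAX_GAP_SECONDS):
    """Return a hypothetical start offset in seconds for each sensor on one device.

    Sensor names must be unique within the list.
    """
    return {name: index * gap_seconds for index, name in enumerate(sensor_names)}


# Device A has only the EXE sensor, so it is scanned first.
device_a = scan_offsets(["EXE sensor A"])

# Device B has one dummy sensor in front of the EXE sensor.
device_b = scan_offsets(["Green IT dummy", "EXE sensor B"])

print(device_a["EXE sensor A"], device_b["EXE sensor B"])
```

In this model, the EXE sensor on device B starts five seconds after the one on device A, so the two no longer request the shared mutex in the same second.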
EXE sensors with mutex and corresponding timeouts
For EXE sensors, the mutex waits three times the timeout. You can define the timeout in the sensor settings. The maximum timeout for the mutex is 15 minutes because a monitoring thread is forcibly terminated after 20 minutes.
If you have 100 sensors with six seconds of runtime each, this equals 10 minutes of runtime in total, and a timeout of four minutes would be appropriate. Then, you would have a total timeout of 16 minutes:
4 min * 3 (mutex) + 4 min (timeout) = 16 min
This leaves enough margin below the maximum runtime of 20 minutes, yet allows all sensors to wait up to 12 minutes for the mutex.
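The arithmetic above can be checked for other sensor counts with a short Python sketch. It only restates the figures from this article (factor of three, 15-minute mutex maximum, 20-minute hard limit). The function name `timeout_budget` and its parameters are hypothetical, and treating the 15-minute maximum as a cap on the mutex wait is an assumption about how the limits relate.

```python
MUTEX_FACTOR = 3             # the mutex waits three times the sensor timeout
MAX_MUTEX_WAIT_MINUTES = 15  # maximum timeout for the mutex
HARD_KILL_MINUTES = 20       # a monitoring thread is killed after this time


def timeout_budget(sensor_count, runtime_seconds, timeout_minutes):
    """Summarize whether a sensor timeout fits the limits named in the article."""
    total_runtime = sensor_count * runtime_seconds / 60  # minutes
    mutex_wait = MUTEX_FACTOR * timeout_minutes
    total_timeout = mutex_wait + timeout_minutes
    return {
        "total_runtime_minutes": total_runtime,
        "mutex_wait_minutes": mutex_wait,
        "total_timeout_minutes": total_timeout,
        # The last sensor in the queue may wait for all others to finish.
        "mutex_covers_runtime": mutex_wait >= total_runtime,
        # Assumption: the 15-minute maximum applies to the mutex wait.
        "within_mutex_limit": mutex_wait <= MAX_MUTEX_WAIT_MINUTES,
        "below_hard_kill": total_timeout < HARD_KILL_MINUTES,
    }


# The example from the text: 100 sensors, 6 s each, 4 min timeout.
budget = timeout_budget(sensor_count=100, runtime_seconds=6, timeout_minutes=4)
for key, value in budget.items():
    print(f"{key}: {value}")
```

With a timeout of five minutes instead, the mutex wait would be 15 minutes and the total would reach the 20-minute hard limit, which is why four minutes is the safer choice in this example.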