Devices and groups in PRTG can be given a maintenance window that pauses monitoring, and the "Pause Until" option can be used in a similar way for sensors, devices, and groups. In our case, however, we usually want to keep observing the status of our monitoring while maintenance occurs, without triggering notifications, so that we can be sure all devices and services come back online as expected.
To accomplish this, we've found a few options:
- Remove the inherited notification from the probe group for the duration of the maintenance window. There are issues with this:
  - This works only if all notifications are inherited
  - If someone forgets to add the notification back, no notifications will be delivered until someone happens to notice the mistake
- Modify the schedule to block notifications during the maintenance window
  - This is much better, but it requires individual schedules and notification templates for each probe; otherwise notifications are also paused for other probes that share the same notification template or schedule
  - Again, if someone forgets to revert the schedule, notifications may be missed during the previously configured window
- Pause notifications
  - This requires individual notification templates for each probe, which is not a bad policy anyway
  - As above, if someone forgets to unpause the notification template, there are no notifications until the mistake is discovered
While most of these issues are largely process issues, our organization is trying to remove as much risk of small misses and mistakes like these as possible. When the "Pause Until" option was introduced for sensors/devices/groups, it was a game changer, and we immediately made it policy never to use the "Pause Indefinitely" option again without documented approval.
If it is possible to introduce the "Pause Until" option to Notification Templates, this would let us be certain that we won't get flooded with tickets during maintenance windows (or when maintenance windows expire) while still being able to observe an environment and make sure everything comes back up as expected. Also, if there are any other options you can suggest to accomplish the same goal, we are all ears.
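To illustrate the behavior we're asking for, here is a rough Python sketch of an external stopgap that pauses a notification template and always tries to resume it afterwards. It rests on an unverified assumption: that PRTG's `/api/pause.htm` endpoint (`action=0` to pause, `action=1` to resume) accepts a notification template's object ID the same way it accepts a sensor's. The function names, the host `prtg.example.com`, the ID `2001`, and the contents of the `auth` dictionary are all hypothetical placeholders.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_pause_url(base_url, object_id, action, auth):
    """Build a PRTG /api/pause.htm URL (action=0 pauses, action=1 resumes).

    `auth` holds whatever credential parameters your PRTG version expects,
    e.g. {"apitoken": "..."} or {"username": "...", "passhash": "..."}.
    """
    query = urlencode({"id": object_id, "action": action, **auth})
    return f"{base_url.rstrip('/')}/api/pause.htm?{query}"


def http_get(url):
    """Default sender: issue the GET request and discard the response body."""
    with urlopen(url) as response:
        response.read()


def pause_template_for(base_url, template_id, auth, seconds,
                       send=http_get, sleep=time.sleep):
    """Pause the template, wait `seconds`, then always attempt to resume it.

    Assumption: pause.htm works on notification template IDs (not verified).
    """
    send(build_pause_url(base_url, template_id, 0, auth))
    try:
        sleep(seconds)
    finally:
        # Runs even if the wait is interrupted (e.g. Ctrl+C).
        send(build_pause_url(base_url, template_id, 1, auth))


# Hypothetical usage for a two-hour maintenance window:
# pause_template_for("https://prtg.example.com", 2001,
#                    {"apitoken": "YOUR_TOKEN"}, 2 * 60 * 60)
```

This only narrows the gap: if the machine running the script goes down mid-window, the template stays paused, which is exactly the kind of miss a native "Pause Until" on Notification Templates would rule out.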