For the edification of everyone else who has been battling this issue, I'll share what I found in my environment.
I've identified that the Cisco System Health Sensors (SHS) do not honor the globally configured sensor settings for the down condition. Although we have sensors configured to warn on the first poll failure, the Cisco SHS do not. This is a bug that Paessler is working on.
The more interesting part is what I discovered with Netflow: it seems that when Cisco Prime Infrastructure runs a backup on the device, whether triggered by a network engineer making an ad hoc config change or by a scheduled backup, the SSH backup appears to create a block condition on SNMP polling. I have a case open with Cisco TAC to determine whether this is expected behavior. I have a very large environment with diverse gear, so basically every class of device is manifesting this behavior during backup.
Effectively, if PRTG is polling during an SSH backup, the SSH block condition combined with the aforementioned Cisco SHS bug gives you an immediate red flare in your monitoring environment...
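If you want to check whether the same correlation shows up in your own environment, one rough approach is to line up the timestamps of failed SNMP polls against the backup windows. Below is a minimal Python sketch of that idea. It assumes you have already exported both sets of timestamps yourself; the function name and the sample data are made up for illustration and do not come from PRTG or Cisco Prime.

```python
from datetime import datetime, timedelta

def failures_during_backups(failures, backups, slack=timedelta(0)):
    """Return the poll failures that fall inside any backup window.

    failures: list of datetime objects (failed SNMP polls)
    backups:  list of (start, end) datetime tuples (backup windows)
    slack:    optional tolerance added to both ends of each window
    """
    hits = []
    for ts in failures:
        for start, end in backups:
            if start - slack <= ts <= end + slack:
                hits.append(ts)
                break  # count each failure once
    return hits

# Hypothetical sample data, for illustration only.
backups = [
    (datetime(2024, 1, 10, 2, 0, 0), datetime(2024, 1, 10, 2, 3, 0)),
    (datetime(2024, 1, 10, 14, 30, 0), datetime(2024, 1, 10, 14, 31, 30)),
]
failures = [
    datetime(2024, 1, 10, 2, 1, 15),
    datetime(2024, 1, 10, 9, 45, 0),
    datetime(2024, 1, 10, 14, 31, 0),
]

overlapping = failures_during_backups(failures, backups)
print(f"{len(overlapping)} of {len(failures)} failures fell inside a backup window")
```

If most of your failures land inside backup windows, that points at the same interaction I'm seeing; if they don't, something else is probably going on.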