We are gradually bringing our implementation of PRTG live and are currently enabling notifications for our sensors, although the sensors themselves have been active for nearly two months.
We have four Cloud HTTP sensors monitoring customer-facing sites with a simple GET request, and notifications configured with basic Down state and threshold triggers. Scanning intervals are set to 10 minutes, and Timeout to the maximum supported value of 5 seconds.
Since enabling the Down state notification we have noticed that one or more of the four sensors occasionally return socket error #10060 (connection timed out), sometimes for several consecutive scanning intervals. When we check the site itself at the time the sensor reports the error, it is reachable and shows no problem.
Reviewing the historical data for one such sensor shows that these errors occur infrequently but regularly enough to be a concern, and occasionally persist for extended periods (up to a maximum of 90 minutes in July).
Today (26th August) we've had 4 such occurrences between 8:30 am and 11:21 am, prompting me to ask the following questions:
1. Is this expected behaviour at present with these sensors?
2. Is there something I can do to mitigate the issue?
NB: We also have external monitoring in place for these sites (running from a Raspberry Pi at my house), but I was hoping to be able to prioritise in-house monitoring using PRTG.
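For anyone wanting to reproduce the manual cross-check described above (requesting the site at the moment the sensor reports the error), here is a minimal sketch, assuming Python 3 and only the standard library. It is not our actual monitoring script: the function name `check_site` and the `opener` parameter are illustrative. It issues a simple GET with the same 5 second timeout and separates timeouts (which is what Winsock error 10060, WSAETIMEDOUT, indicates) from other failures.

```python
import socket
import urllib.error
import urllib.request


def check_site(url, timeout=5.0, opener=urllib.request.urlopen):
    """Issue a simple GET and classify the outcome.

    Returns a (state, detail) tuple where state is one of
    'up', 'timeout' or 'error'. The opener argument is a
    hypothetical hook so the function can be tested offline.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return ("up", resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an HTTP error status.
        return ("error", exc.code)
    except urllib.error.URLError as exc:
        # Connection-phase failures are wrapped in URLError;
        # a timeout shows up as the wrapped reason.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return ("timeout", str(exc.reason))
        return ("error", str(exc.reason))
    except (socket.timeout, TimeoutError) as exc:
        # Timeouts while reading the response are raised directly.
        return ("timeout", str(exc))
```

Calling `check_site("https://example.com")` (placeholder URL) from a second location while the sensor is alerting would show whether the timeout is also visible from outside the PRTG cloud probes or only from them.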