Direct performance counters stopped working

After a recent network hiccup, direct performance counters stopped working. Existing sensors (pure performance counters) report: The wait operation timed out (Performance Counter error 0x102). New sensors (pure performance counters) cannot be created; they fail with the same error. Several devices are affected.

I have already checked the requirements for performance counters. The PRTG probe is running on Windows 2008. The Remote Registry service is running on both the PRTG probe and the target computer. Both machines are members of the same domain. I have checked the credentials used for Windows systems and also tried another account with domain admin privileges. The performance counter exists on the target machine; moreover, it can be monitored from another machine.

I can see nothing in the PRTG Log tab. What else can I check to identify where the fault is? Thank you.

error-messages performance-counters performancecounter

Created on May 9, 2014 11:05:26 AM



16 Replies

Hello,

Thank you very much for your KB post. Please check whether any of the tips in the "Performance Counter Troubleshooting Guide" in our KB help: My Windows sensors do not work when using direct Performance Counter access. What can I do?

Also, please do not hesitate to add a comment to that article if you find anything helpful that is not yet mentioned there.

best regards.

Created on May 12, 2014 11:06:35 AM by  Torsten Lindner [Paessler Support]

Last change on Sep 28, 2015 11:38:12 AM by  Torsten Lindner [Paessler Support]



Hi, that is exactly the article I have been using to troubleshoot my issue. As you can see from my initial question, I have checked all the requirements for performance counters and found nothing. Can you advise anything else, please?

Created on May 12, 2014 1:51:44 PM



Try rebooting the target, as well as the PRTG host machine. What happens then if you switch to WMI as preferred data source?

Created on May 12, 2014 2:09:36 PM by  Torsten Lindner [Paessler Support]



The target machine was already rebooted. At the moment it is not possible to reboot the PRTG host, as it monitors a lot of other devices. If I choose WMI as the preferred data source, nothing changes. As far as I understand, the preferred data source setting only affects hybrid sensors, which can work either way. I use pure performance counter sensors, for instance "PerfCounter IIS Application Pool BETA" or "PerfCounter Custom". These sensors only work with performance counters. Is that correct? What else can I do besides rebooting the PRTG host? Thank you.

Created on May 12, 2014 3:42:41 PM



You can also try restarting the Probe Service. Or finally, deploy a Remote Probe onto the target and try checking the Performance Counters locally.

Created on May 12, 2014 4:21:52 PM by  Torsten Lindner [Paessler Support]

Last change on May 12, 2014 4:21:59 PM by  Torsten Lindner [Paessler Support]



Thank you. Restarting the Probe service on the PRTG probe host fixed the issue.

Created on May 13, 2014 2:10:57 PM



Hello, I'm getting the same error and I get a 404 when trying to follow the link above. Restarting the probe service solves it for a short while but the issue reoccurs.

Created on Sep 28, 2015 9:11:31 AM



Please try the link again; I have corrected it.

Created on Sep 28, 2015 11:38:52 AM by  Torsten Lindner [Paessler Support]



Hi Paessler support team.

We've experienced some similar problems, and I think it could be a bug in the remote probe part of your software. We'd like your input and potential help in diagnosing the problem. We're running PRTG 15.1.13.1382+.

We have about 800 sensors in our installation. I've seen the scenario I'm about to describe unfold twice now, with exactly the same symptoms.

Part of our setup is a lot of WMI queries against a series of our Windows servers. Let's call these sensors WA1, WA2, WB1, WB2, etc., denoting W (WMI sensor), A (machine A), 1 (sensor number 1). The exact sensor type is "WMI Vital System Data (V2) sensor". For some other devices we have some performance counter sensors. We'll call these PX1, PX2, PY1, PY2, etc., denoting P (PerfCounter sensor), X (machine X), 1 (sensor number 1). The exact sensor type is "PerfCounter Custom sensor".

What we've experienced twice now is that a runaway process on a machine against which we run WMI sensors ate all the memory on the machine and then some. This caused the machine to become very "sluggish" and pretty much fail to respond to anything in a timely manner. As a result, the WMI sensors started timing out (for good reason), which is exactly what we wanted to see because it revealed the problem. So in our error scenario, for the sake of argument, let's say sensors WB1 and WB2 go into a timeout state (I believe the exact error was "Message was cancelled by the message filter"; I was able to get the same error doing WMI queries against machine B from PowerShell).

Shortly after this happened, we also started to see timeouts on all our performance counter sensors against completely different machines. The error for these sensors was "The wait operation timed out (Performance Counter error 0x102)". That is, all of the "PerfCounter Custom sensor" instances PX1, PX2, PY1, PY2, etc. started displaying this timeout.

I went to machines X, Y, Z, etc. and verified that there were no problems. The performance counters were fine when viewed locally. I also went to the server running our remote probe and queried the performance counters on machines X, Y, Z, etc., using something like PowerShell's "Get-Counter" or Performance Monitor connected remotely to those machines. Again, no problems were observed: the counters replied just fine and gave correct values. However, PRTG continued displaying timeouts.

In the meantime, we rebooted server B, the machine with the WMI sensors that were timing out, and the WMI sensors WB1 and WB2 started responding again and giving normal values. However, our PerfCounter sensors PX1, PX2, PY1, PY2, etc. were all still timing out. I've tried pausing the PerfCounter sensors for up to 30 minutes and resuming them, to no avail: they time out again as soon as they are resumed.

This scenario has played out twice now, and in both cases the only thing I could do to resolve the issue was to restart the PRTG remote probe service completely. That seems to "clear the pipes", so to speak, and once the probe comes back online and catches up with the backlog of queries, all the PerfCounter sensors work fine again.

So the very short version of all this is: timeouts on WMI sensors in one part of the architecture seem to cause all of our PerfCounter sensors against other parts of the architecture to start timing out as well, with the error "The wait operation timed out (Performance Counter error 0x102)", for no apparent reason. The only way I've found to fix the problem is to restart the remote probe, which is not very desirable because it leaves us "in the dark", so to speak, with regard to monitoring while it catches back up.

Is this scenario something you can replicate in your internal test environments or do you have some potential ideas on what could be causing it?

Best Regards Peter Dahlgaard

Created on Jan 6, 2016 9:49:48 AM



Peter, I can assure you that the Performance Counter "System" and the WMI System in the PRTG Probe are not connected to each other. They are independent systems and cannot block each other. The first thing you should do is check the scanning intervals on all Windows-based sensors and set them to at least 5 minutes. You could also remove or pause any sensors that are not "really necessary" (and only add them in certain debug situations). These two actions should already help. Please also make sure that the Windows-based sensors inherit the scanning interval set at a higher level; you can check this with the "Sensors" -> "Cross Reference" tables, which can be sorted by the Interval column. The next step could then indeed be deploying Remote Probes; if the targets are grouped in subnets or similar, put a Remote Probe in each of these subnets. The last resort would be to use SNMP to monitor the Windows hosts, with something like SNMP Informant, whose free version already covers the most basic WMI counters.

Created on Jan 7, 2016 8:08:09 AM by  Torsten Lindner [Paessler Support]



Hi Torsten

Thank you for your reply. It's good to hear that the two are independent of each other.

Regarding the scanning interval, the vast majority of our sensors run on 5-minute scanning intervals. This is the default setting in our setup, and nearly all sensors inherit it. A few run at a lower frequency, such as once or twice a day, and a select few run on 1-minute intervals, but these are the exception to the rule. We're running all sensors on a single remote probe installed on a machine separate from the core server. The machine running the probe does not appear to be overloaded when we inspect it.

I've been looking at the Probe Health sensor in PRTG, and normally we have around 1-5 open requests listed. At the time of the incident, when the WMI sensors against a specific host started timing out, the open requests spiked to around 130 and then dropped back to normal values over a period of 10-15 minutes or so. None of the other values on the Probe Health sensor displayed abnormal behavior. There is no WMI interval delay or the like. It is not my impression that the probe is overloaded in any way, nor that there is any kind of network delay. This is supported by the fact that I can log on to the machine running the probe software, start Performance Monitor, connect to the perf counters on the target machines, and read them with no delay or problems. The same applies when running "Get-Counter" in PowerShell from the probe machine against the target machines. Meanwhile, the perf counter sensors in PRTG running on the same machine were all timing out with "The wait operation timed out (Performance Counter error 0x102)", regardless of the target machine.

Thus I don't believe there's any problem with the machine running the probe software, the target machines, or the network itself.

I've seen this behavior twice now: timeouts on WMI sensors in one part of our infrastructure seem to cause timeouts on PerfCounter sensors in PRTG against completely separate parts of our infrastructure (which aren't really in any state of error or timeout when checked manually outside PRTG). PRTG is simply giving us false alarms. The odd thing to me, and what led me to assume it may be a problem inside PRTG, is that even though the WMI-based sensors (which had been timing out) had started responding again, the perf counter sensors still timed out. The only way I found to resolve the issue was to stop the remote probe service on the machine and restart it (only the service, not the machine itself). As soon as it was restarted, the perf counters all started responding normally again.

If this continues, one option could be to use SNMP to monitor the hosts instead, or simply to write a small PowerShell wrapper script around "Get-Counter" and have that pull the performance-counter-based values from the remote hosts instead of going through PRTG's native sensor. We would prefer to use PRTG's native sensor if possible, though, because that way all configuration can be handled easily and in a streamlined manner through the PRTG web UI.
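As an illustration of the wrapper idea, here is a minimal sketch in Python rather than PowerShell. It is not the poster's script. It assumes PRTG's EXE/Script custom sensor convention of printing a single `value:message` line and signalling status through the exit code (0 for OK, 2 for a system error); please verify this against the PRTG manual before relying on it. The `run_check` helper and the label are hypothetical names, and the function that actually reads the counter is left to the reader.

```python
import subprocess
from typing import Callable, Tuple

EXIT_OK = 0            # assumed PRTG exit code for "OK"
EXIT_SYSTEM_ERROR = 2  # assumed PRTG exit code for "system error"


def run_check(read_value: Callable[[], float], label: str) -> Tuple[str, int]:
    """Run one counter read and turn the outcome into (output line, exit code)."""
    try:
        value = read_value()
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError,
            OSError, ValueError) as exc:
        # Report the failure instead of letting the script crash, so the
        # sensor shows an error with a readable message.
        return f"0:{label} failed: {type(exc).__name__}", EXIT_SYSTEM_ERROR
    return f"{value}:{label} OK", EXIT_OK


# Hypothetical usage inside a sensor script:
#   import sys
#   line, code = run_check(my_counter_reader, "Available MBytes")
#   print(line)
#   sys.exit(code)
```

The trade-off is the one described above: a script like this runs outside PRTG's native PerfCounter handling, but each counter then has to be configured in the script or its parameters rather than through the web UI.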

Do you have any other ideas as to what may cause this behavior?

Best Regards Peter Dahlgaard

Created on Jan 7, 2016 10:02:00 AM



Peter, you could also try installing Remote Probes directly onto the targets. That will mean local WMI / Performance Counter monitoring, which usually also is much less troublesome.

Created on Jan 8, 2016 10:42:04 AM by  Torsten Lindner [Paessler Support]



Good day, I am also experiencing this behavior. All my App Pool sensors stop working at the same time for an unknown reason, and the only way I have found to fix it is to restart the probe. This has already happened 3 times in the last 1.5 months.

Probe: PRTG Network Monitor 16.1.22.2658 x64, running on Win2012 R2
Clients: on multiple remote sites

Best regards Tomas

Created on Apr 5, 2016 2:59:00 AM



Hello Tomas,

Which scanning interval is configured for these sensors and how many WMI sensors in total are running on the same probe? Did you already go through this guide?

Kind regards.

Created on Apr 5, 2016 2:46:31 PM by  Erhard Mikulik [Paessler Support]



Hi Erhard,

One probe, 224 WMI sensors in total, and 17 App Pool sensors. Some are on a 5-minute interval, others on 30 seconds. Some of the sensors are on a remote site, others are in the local datacenter ("close" to the probe). The App Pool sensors dropped again this morning, and restarting the probe fixed it.

Thank you Tomas

P.S. Shall I create a service call?

Created on Apr 7, 2016 2:47:21 AM
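To put the scanning-interval question in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes that each sensor issues one request per scanning interval and that the 17 App Pool sensors come on top of the 224 WMI sensors; both are simplifications. The split between 5-minute and 30-second sensors is not stated in the thread, so the sketch only computes the two bounds.

```python
from typing import Dict


def requests_per_second(sensors_by_interval: Dict[int, int]) -> float:
    """Average scans per second for a {interval_seconds: sensor_count} mapping."""
    return sum(count / interval
               for interval, count in sensors_by_interval.items())


TOTAL = 224 + 17  # WMI sensors plus App Pool sensors (assumed additive)

best_case = requests_per_second({300: TOTAL})  # every sensor on 5 minutes
worst_case = requests_per_second({30: TOTAL})  # every sensor on 30 seconds
print(f"{best_case:.2f} to {worst_case:.2f} scans per second")
# prints "0.80 to 8.03 scans per second"
```

The tenfold spread between the two bounds is why the share of sensors on the 30-second interval matters when judging how heavily the probe is loaded.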



Hi Tomas,

Yes, that would be best so we can take a look at the logs. Send us a Support Bundle from within PRTG, you can use PAE683491 as ticket-id for reference. In case those sensors are running on a Remote Probe, open the PRTG Administration Tool on that Remote Probe in order to send us its logs (see tab Logs and Info).

Kind regards.

Created on Apr 7, 2016 11:06:53 AM by  Erhard Mikulik [Paessler Support]



