Ok,
[we have trouble running the probe service on some devices]
After examining several probe logs we found out the following:
Problem Behavior Case 1 (W7 ent x86): The probe starts terminating its threads, and we can see 13 such entries in the log file. When it should kill the remaining threads, it stops and stays in the stopped state, with no further entries in the log.
Problem Behavior Case 2 (XP pro x86): There is a sudden rise in alerts and in the probe's message queue (400-1400 of 3000 sensors, 3k-7k messages in the queue). At the same time the probe's memory reaches its lowest point. After 20-120 minutes the probe service is able to restart by itself. This seems to be related to available memory, which keeps declining until the service restarts; after the restart it is back to its original amount. Using any "memory cleaning" software seems to prolong this cycle significantly.
Problem Behavior Case 3 (W7 ent x86): The probe starts terminating its threads, and we can see some entries in the log file. In the middle of the thread termination this appears: "Trying to restart probe. Reason: connection unresponsive for more than 60 seconds.". Then the remaining threads are killed and the service is restarted properly.
Overall, memory seems to be more stable on x64 operating systems (XP/W7) than on x86 systems, where it keeps declining until the service restarts. We do not see similar behavior on 2008R2 or 2003 x86 servers.
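
To put a number on the memory decline described above, one option is to note the available memory at a few points in time and fit a straight line through the readings. The following is only a rough sketch: the function names, the 200 MB floor and the sample readings are made up for illustration and are not taken from our probe logs.

```python
from typing import List, Optional, Tuple

# (minutes since service start, available memory in MB); hypothetical layout
Sample = Tuple[float, float]


def memory_trend(samples: List[Sample]) -> Tuple[float, float]:
    """Least-squares line through the samples: (slope in MB/min, intercept in MB)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    slope = cov / var_t
    return slope, mean_m - slope * mean_t


def minutes_until(samples: List[Sample], floor_mb: float) -> Optional[float]:
    """Minutes after the last sample until the fitted line reaches floor_mb.

    Returns None when memory is not declining.
    """
    slope, intercept = memory_trend(samples)
    if slope >= 0:
        return None
    t_floor = (floor_mb - intercept) / slope
    return max(0.0, t_floor - samples[-1][0])


# Made-up readings, one every 10 minutes
readings = [(0, 1000.0), (10, 950.0), (20, 900.0), (30, 850.0)]
slope, _ = memory_trend(readings)
print(f"decline: {-slope:.1f} MB/min")
print(f"minutes until 200 MB: {minutes_until(readings, 200.0):.0f}")
```

Comparing the slope between an x86 and an x64 machine with the same sensor count would show whether the x64 systems really decline more slowly or not at all.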