I have 2 servers that show a down status for physical drives, with the error "The disk with serial number could not be found (code: PE156)". How do I clear this, given that both drives are fine?
Did you by any chance replace the disk or move the PRTG probe to another machine?
I think the only solution is to drop the sensor and add a new one.
I had this issue on two physical disks on the same device. They were working fine until the server rebooted due to Windows updates. Then two didn't work, while the other 14 worked just fine. It was on a Dell PowerEdge server and they were the only two SSDs in the server (the rest were standard SATA and SAS). I thought the type of disk had something to do with it, but in the end it didn't. I tried upgrading RAID drivers and firmware, and that didn't fix it either.
Then I noticed the serial name listed in the error message (see below).
Error: The disk with serial number "425026040000ags/T43P" could not be found (code: PE156)
But this 'serial number' didn't match what I could see in iDRAC. The ones that were working correctly had the correct serial number. I tried to change it in PRTG (maybe there is a way, but I couldn't find it). So I removed the sensor and re-added the same disk, and sure enough it works, and with the correct serial number! I don't know why or how it changed (I'm assuming Dell's OMSA SNMP picked up a different number for the disk upon reboot), but removing and re-adding the sensor for the same disk allowed PRTG to look for the updated serial number.
Removing and re-adding the sensor works for me as well, but only until the next reboot, at which point it breaks again. What seems to be happening in my case is that the last digit of the serial gets cut off on one reboot and re-added on the next reboot, then cut off again, and so on. Is there any way I can make the PRTG sensor more flexible, either by using a wildcard for the serial number or by ignoring the serial number and just using the physical location/SAS port (which does not change between reboots)?
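To illustrate the kind of tolerant match being asked for here: the following is only a minimal sketch of the comparison logic, for example for use in a custom script, and not something the PRTG sensor offers. The function name `serials_match`, the `max_missing` parameter, and the sample serials are all made up for illustration.

```python
def serials_match(expected: str, reported: str, max_missing: int = 1) -> bool:
    """Return True if both serials are equal, or if one of them is the
    other with at most `max_missing` trailing characters cut off.

    Hypothetical helper: it only models the "last digit gets cut off"
    behaviour described above.
    """
    a, b = expected.strip(), reported.strip()
    # Order the two strings by length so the check works in both directions
    # (serial truncated on this reboot, or restored on the next one).
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        # An empty serial should never count as a match.
        return False
    return longer.startswith(shorter) and len(longer) - len(shorter) <= max_missing


if __name__ == "__main__":
    # Made-up serials, for illustration only.
    print(serials_match("S3X9NX0K123456", "S3X9NX0K12345"))  # last digit cut off
    print(serials_match("S3X9NX0K123456", "S3X9NX0K999999"))  # different disk
```

Comparing only a leading prefix like this is a deliberate trade-off: it tolerates the truncation, but two disks whose serials differ only in the last character would no longer be told apart.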
Unfortunately, as of now, there's no way of inserting a wildcard for the disk serial. The only workaround would be to pause the old sensor indefinitely and create a new one once it fails :( Is there any update available for the target host ILO to prevent that from happening by any chance?
Stephan Linke, Tech Support Team
I have the latest iDRAC and RAID firmware installed for the affected PowerEdge R630s. The odd thing is that it's only happening on those servers that have SSD boot drives, but not those that have HDD boot drives. I would raise the issue with Dell but since their own software seems to handle it OK and the server is out of warranty, I doubt they'll listen :)
Most likely, yep :/ Perhaps an issue between the iDRAC and the RAID controller?
Stephan Linke, Tech Support Team
> their own software seems to handle it OK
I guess that their own software runs as a new instance when the server is rebooted; you can compare this to creating a new sensor instance, which also works ;-)
Actually, by "their own software", I meant "OpenManage Essentials" running on a separate server. So it does not get restarted when these servers do, nor does iDRAC (I would need to cut power for that to happen). I'm guessing Dell does not use the serial number as an identifying marker on disks, but rather the "Connector - Enclosure - Disk" combo, because I've run into issues in the past where disks were replaced without following proper procedure and the "failed" flag would carry over to the new disk, even though the serial number had changed. The controller simply didn't care or wasn't smart enough to reset the flag.
No worries, I may just do what you suggested and rotate 2 sets of sensors :)
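For anyone scripting their own check instead, the "Connector - Enclosure - Disk" idea guessed at above can be sketched roughly like this. It is only an assumption about how such a lookup could work, not how Dell or PRTG actually do it; the `Disk` record, its field names, and the sample values are hypothetical and would have to be filled from whatever source you poll.

```python
from typing import Iterable, NamedTuple, Optional


class Disk(NamedTuple):
    """Hypothetical record for one polled physical disk."""
    connector: int
    enclosure: int
    slot: int
    serial: str
    state: str


def find_disk(disks: Iterable[Disk], connector: int, enclosure: int,
              slot: int) -> Optional[Disk]:
    """Look a disk up by its physical location and ignore the serial."""
    # Index the poll result by (connector, enclosure, slot).
    by_location = {(d.connector, d.enclosure, d.slot): d for d in disks}
    return by_location.get((connector, enclosure, slot))


if __name__ == "__main__":
    # Made-up poll result after a reboot: the serial of the disk in
    # slot 0 has lost its last character, but its location is unchanged.
    poll = [
        Disk(0, 1, 0, "S3X9NX0K12345", "Online"),
        Disk(0, 1, 1, "S3X9NX0K654321", "Online"),
    ]
    disk = find_disk(poll, connector=0, enclosure=1, slot=0)
    print(disk.state if disk else "disk not found")
```

The flip side is the one described above: a location-based key cannot tell that a disk was swapped, so a check like this would keep reporting on whatever disk sits in that slot.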