When I check the node -> Monitor -> Triggered Alarms view, I see "status of other host hardware objects" in a critical state. When I click on it, it takes me to the Summary page, where nothing looks wrong. As you can see in the picture, there is a red circle on the node's icon, but the Summary page shows no error message.
How can I find out more?
Hi mahmn,
It looks like that is an old alarm warning that has been acknowledged but not reset to green.
Select the item and click the “Reset to Green” option which will then become available. This will remove the alarm and set the object to healthy.
Kind regards.
The trigger time is fairly old. Reset it to green, and if it happens again, check the current logs and health status of this physical server through the vendor's OOB/LOM solution (for example, iLO on HP ProLiant servers).
OK. I reset it to green, but a few hours ago I saw the alarm again. So I checked the events at the time the alarm was triggered, and I see this record:
Hardware Sensor Status: Processor green, Memory green, Fan green, Voltage green, Temperature green, Power green, System Board green, Battery green, Storage green, Other red
How can I find out what that "Other" is?
Are you sure no logs containing a warning or fault were generated during that period?
Please check the following log files; you may find a related issue there:
/var/log/syslog.log, /var/log/vmkernel.log, /var/log/vmksummary.log
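As a small aid for reading event records like the one above, here is a minimal Python sketch that lists which sensor categories are not green. It only assumes the "name state" comma-separated layout seen in that record; the function name non_green_sensors is made up for illustration, and it will not tell you which physical sensor is behind "Other".

```python
def non_green_sensors(record: str) -> dict:
    """Return {category: state} for every category that is not green.

    Assumes the layout seen in the event record quoted above:
    'Hardware Sensor Status: Name state, Name state, ...'
    """
    # Keep only the part after the label; empty if the label is missing.
    _, _, body = record.partition("Hardware Sensor Status:")
    flagged = {}
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue
        # The state is the last word; the name may contain spaces
        # (for example "System Board").
        name, _, state = item.rpartition(" ")
        if state.lower() != "green":
            flagged[name] = state
    return flagged


record = (
    "Hardware Sensor Status: Processor green, Memory green, Fan green, "
    "Voltage green, Temperature green, Power green, System Board green, "
    "Battery green, Storage green, Other red"
)
print(non_green_sensors(record))
```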
I did another check. At the time that this alarm is triggered:
01/05/2021, 8:49:39 PM Hardware Sensor Status: Processor green, Memory green, Fan green, Voltage green, Temperature green, Power green, System Board green, Battery green, Storage green, Other red
I checked the syslog on the ESXi host for the period from 8 PM to 9 PM. My guess is that the object is the BMC, which I don't think is critical.
2021-01-05T20:30:00Z crond[2098816]: USER root pid 2269536 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2021-01-05T20:31:37Z sensord[2269535]: ipmi_completion: expected at least one byte/completion code to be returned, retries left: 10
2021-01-05T20:31:57Z sensord[2269535]: recv_reply: bmc timeout after 20000 millisconds
2021-01-05T20:31:57Z sensord[2269535]: ipmi_completion: no reply, failed to communicate with bmc
2021-01-05T20:35:00Z crond[2098816]: USER root pid 2269550 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2021-01-05T20:40:00Z crond[2098816]: USER root pid 2269564 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2021-01-05T20:45:00Z crond[2098816]: USER root pid 2269578 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2021-01-05T20:50:00Z crond[2098816]: USER root pid 2269592 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2021-01-05T20:51:29Z sensord[2269535]: ipmi_completion: expected at least one byte/completion code to be returned, retries left: 10
2021-01-05T20:51:49Z sensord[2269535]: recv_reply: bmc timeout after 20000 millisconds
2021-01-05T20:51:49Z sensord[2269535]: ipmi_completion: no reply, failed to communicate with bmc
2021-01-05T20:51:49Z sensord[2269535]: Warning: Unexpected error Failure
2021-01-05T20:51:49Z watchdog-sensord: '/usr/lib/vmware/bin/sensord -l' exited after 1317 seconds 1
2021-01-05T20:51:49Z watchdog-sensord: Executing '/usr/lib/vmware/bin/sensord -l'
2021-01-05T20:51:49Z sensord[2269609]: ipmi_completion: IPMI completion code for msgid: 5, cc = 0x80
2021-01-05T20:55:00Z crond[2098816]: USER root pid 2269613 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh
2021-01-05T20:59:40Z sensord[2269609]: ipmi_completion: expected at least one byte/completion code to be returned, retries left: 10
2021-01-05T21:00:00Z sensord[2269609]: recv_reply: bmc timeout after 20000 millisconds
2021-01-05T21:00:00Z sensord[2269609]: ipmi_completion: no reply, failed to communicate with bmc
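For anyone who wants to summarise an excerpt like the one above instead of reading it line by line, here is a minimal Python sketch that counts the sensord BMC timeouts and watchdog restarts. The line layout ("timestamp process[pid]: message") is inferred only from this excerpt, and the names LINE_RE and summarize_sensord are hypothetical, so treat it as a starting point rather than a complete parser.

```python
import re
from collections import Counter

# Layout inferred from the excerpt: "<timestamp> <process>[<pid>]: <message>"
# The "[pid]" part is optional (watchdog-sensord lines have none).
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<proc>[\w-]+)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)


def summarize_sensord(lines):
    """Count BMC communication problems reported by sensord.

    Returns (counts, timeout_timestamps).
    """
    counts = Counter()
    timeouts = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        proc, msg = match["proc"], match["msg"]
        if proc not in ("sensord", "watchdog-sensord"):
            continue  # skip crond and other unrelated entries
        if "bmc timeout" in msg:
            counts["bmc_timeout"] += 1
            timeouts.append(match["ts"])
        elif "failed to communicate with bmc" in msg:
            counts["bmc_no_reply"] += 1
        elif proc == "watchdog-sensord" and "exited after" in msg:
            counts["sensord_restart"] += 1
    return counts, timeouts


# A few lines copied from the excerpt above.
SAMPLE = [
    "2021-01-05T20:50:00Z crond[2098816]: USER root pid 2269592 cmd /bin/hostd-probe.sh ++group=host/vim/vmvisor/hostd-probe/stats/sh",
    "2021-01-05T20:51:49Z sensord[2269535]: recv_reply: bmc timeout after 20000 millisconds",
    "2021-01-05T20:51:49Z sensord[2269535]: ipmi_completion: no reply, failed to communicate with bmc",
    "2021-01-05T20:51:49Z watchdog-sensord: '/usr/lib/vmware/bin/sensord -l' exited after 1317 seconds 1",
]

counts, timeouts = summarize_sensord(SAMPLE)
print(dict(counts), timeouts)
```

To run it against a real file, pass the lines of a copy of the host's syslog.log to summarize_sensord instead of SAMPLE.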
I think this is related to the server firmware. It would be worth updating the BMC firmware. See the following link:
https://kb.vmware.com/s/article/2001933
I had a similar problem. The message appeared right after I installed new memory modules, which turned the intrusion sensor red. After I reset the sensor, the problem disappeared.