In my homelab I have a Supermicro system, and normally I run with one host. For updates I switch on a second host, which is in standby most of the time.
Now, after updating from the latest U2 to U3, I can't put the host into standby. I get an error that this isn't supported on this object.
When I try to modify the power management settings, I get an error that one value is incorrect: ipmiinfo
In the past I had no problems, so something must have changed here.
Is there anything I can try or modify to get this working again?
I did some research.
From the vpxd.log (I shortened the lines a little):
error vpxd[07535] [Originator@6876 sub=Default] [IpmiLib:ipmi_packet_unpack] IPMI command failed 0xCC
error vpxd[07535] [Originator@6876 sub=Default] [IpmiLib:ipmilan_v2_command] ipmi_command returned 18
error vpxd[07535] [Originator@6876 sub=Default] [IpmiLib:ipmi_packet_unpack] IPMI command failed 0xCC
error vpxd[07535] [Originator@6876 sub=Default] [IpmiLib:ipmilan_v2_command] ipmi_command returned 18
error vpxd[07535] [Originator@6876 sub=Default] [IpmiLib:ipmi_packet_unpack] IPMI command failed 0xCC
error vpxd[07535] [Originator@6876 sub=Default] [IpmiLib:ipmilan_v2_command] ipmi_command returned 18
error vpxd[07535] [Originator@6876 sub=MoHost] IPMI lib: GetMacAddr call failed; error: 34 ('FAILED')
error vpxd[07535] [Originator@6876 sub=Default] ribcl_ilo3:Bad HTTP response (404)
error vpxd[07535] [Originator@6876 sub=Default] hpilo_pm: ILO MAC string not found
error vpxd[07535] [Originator@6876 sub=MoHost] ILO lib: GetMacAddr call failed; error: 128 ('MISMATCHED_BMC_MAC_ADDR')
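For reference: completion code 0xCC is defined in the IPMI v2.0 specification as "Invalid data field in Request", i.e. the BMC is rejecting something inside the request payload (the "returned 18" looks like vCenter's own internal error mapping, that part is my assumption). A small sketch of the generic completion codes from the spec:

```python
# Generic IPMI completion codes as defined in the IPMI v2.0 spec (Table 5-2).
# Only a few common codes are listed here.
COMPLETION_CODES = {
    0x00: "Command completed normally",
    0xC1: "Invalid command",
    0xC7: "Request data length invalid",
    0xC9: "Parameter out of range",
    0xCC: "Invalid data field in Request",
    0xD4: "Insufficient privilege level",
    0xFF: "Unspecified error",
}

def describe_completion_code(code: int) -> str:
    return COMPLETION_CODES.get(code, f"Unknown/OEM code 0x{code:02X}")

print(describe_completion_code(0xCC))  # the code from vpxd.log
```

So the BMC understood the command itself but rejected one of its data bytes.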
But why?
I configured:
And when I query the info on the host:
[root@vmsrv02:~] esxcli hardware ipmi bmc get
BMC Firmware Version: 6.71
Hostname Reported:
IPMI Version: 2.0
IPv4 Address: 192.168.88.142
IPv4 Gateway: 192.168.88.250
IPv4 Subnet: 255.255.255.0
IPv6 Addresses:
LAN If Admin Status: true
MAC Address: ac:1f:6b:37:f1:41
Manufacturer: Super Micro Computer Inc.
OS Name Reported:
[root@vmsrv02:~]
And I think this post should be moved to vCenter Server, because vCenter talks to the IPMI interface, not the hypervisor...
I have contacted the support...
NetApp closed the ticket without helping me; they wrote that vCenter 7.0.3 isn't listed in their IMT with an H410C host. But they didn't tell me where to find that info; I only found that the H410C supports ESXi 7.0.2, with no word regarding the vCenter version.
I also searched the VMware compatibility matrix for information about compatibility between the host hardware and vCenter, but I found nothing there either.
VMware is helping me a little, but told me the correct way is to open a ticket at NetApp, and then NetApp has to contact VMware internally.
Supermicro, who builds the NetApp H410C, is helping a little, but can't guarantee anything, because NetApp possibly uses modified OEM firmware...
Not very easy.
But I found something: I got info from VMware on how to enable debug logging for the IPMI communication, and there I found something interesting.
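In case someone else wants to capture the same traces: raising the vpxd log level is presumably done via the standard vCenter advanced setting `config.log.level` (this is my assumption of the knob support pointed to; the exact instructions from VMware aren't quoted here):

```
vSphere Client > vCenter > Configure > Settings > Advanced Settings
  config.log.level = trivia    (most verbose vpxd logging; set back to "info" afterwards)

Then follow the IPMI traffic on the VCSA shell:
  tail -f /var/log/vmware/vpxd/vpxd.log | grep -i ipmi
```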
I installed a VCSA 7.0.2 and made a debug log of a successful configuration of the power management. Then I moved the host back to the VCSA 7.0.3 and did the same again.
The VCSA 7.0.3 establishes a session, then the BMC and the VCSA exchange some information. But then, when one command is sent, I found a difference.
This was the working part:
[IpmiLib:ipmi_cmd_lan] (seq=0 cmd c02 netfn 0c lun 00 sa 20) cmd length=4
[IpmiLib:dump_buf] dumping buffer for: ipmi_cmd_lan: cmd data (len=4)
[IpmiLib:dump_buf] (0000) 01 05 00 00
[IpmiLib:ipmi_cmd_lan] start IPMI 2.0 command:c02
The non-working part is a little different in one line:
[IpmiLib:ipmi_cmd_lan] (seq=0 cmd c02 netfn 0c lun 00 sa 20) cmd length=4
[IpmiLib:dump_buf] dumping buffer for: ipmi_cmd_lan: cmd data (len=4)
[IpmiLib:dump_buf] (0000) 0e 05 00 00
[IpmiLib:ipmi_cmd_lan] start IPMI 2.0 command:c02
The IPMI command is the same, but the VCSA 7.0.3 encodes it differently than the VCSA 7.0.2. With the first, working command the VCSA gets the MAC address back; with the wrong command the VCSA gets an error message. This is retried a few times, and then the VCSA tries a different protocol: HP iLO.
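To make the one-byte difference readable: per my reading of the IPMI v2.0 spec, NetFn 0x0C / Cmd 0x02 is "Get LAN Configuration Parameters", and the four request bytes are channel, parameter selector, set selector and block selector. Parameter 5 is the MAC Address. So 7.0.2 asks for channel 1, while 7.0.3 sends channel 0x0E, which the spec defines as "the channel this request was issued on" and which this Supermicro BMC apparently rejects with 0xCC. The decoder below is my own sketch, not code from vpxd:

```python
# Decode the 4-byte request body of an IPMI "Get LAN Configuration Parameters"
# command (NetFn 0x0C Transport, Cmd 0x02), as seen in the vpxd debug dumps.
LAN_PARAMS = {3: "IP Address", 4: "IP Address Source", 5: "MAC Address", 6: "Subnet Mask"}

def decode_lan_request(data: bytes) -> str:
    channel, param, set_sel, block_sel = data
    channel &= 0x0F  # the low nibble carries the channel number
    if channel == 0x0E:
        chan_desc = "this channel (0Eh: the channel the request was issued on)"
    else:
        chan_desc = f"channel {channel}"
    return f"{LAN_PARAMS.get(param, param)} from {chan_desc}"

print(decode_lan_request(bytes([0x01, 0x05, 0x00, 0x00])))  # working request, VCSA 7.0.2
print(decode_lan_request(bytes([0x0E, 0x05, 0x00, 0x00])))  # failing request, VCSA 7.0.3
```

Both ask for the same parameter (the MAC address); only the channel selector differs.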
To me it looks like a bug in the VCSA 7.0.3, so I'm curious about the answer from VMware support regarding my findings...
And as I wrote in my last answer, this should be moved to the vCenter discussions.
Discussion moved to vCenter™ Server Discussions
OK, I have the confirmation from support that this is a bug caused by a code change.
And i got a test VCSA where they had fixed the problem.
Any news on this one?
We and all of our customers are affected by this issue, on Supermicro servers in general, ranging from X9 to X11 series mainboards.
Still happening on the current VCSA (7.0.3.005000).
This issue is scheduled to be fixed in the next vCenter Update, perhaps U3e.
I opened a case for this and already got a fix for it.
@Raudi Would you mind DMing me the SR# you reported this in? Thanks!
edit: never mind, found it.
This is indeed targeted to be resolved in an upcoming patch release, please note that VMware can't comment on a timeline and all fixes are tentative until GA.
I just installed 7.0.3.00600, and here the problem still isn't fixed... I hope the next patch will include the fix...
Yesterday I installed 7.0 U3f (7.0.3.00700), and now it works again!
Problem solved!
How do you use DPM? Do you manually put the server in standby, or does vCenter do it automatically? Can you list everything that's needed to get it to work automatically? I have 2 Dell servers; I have tried it myself and can't get it to work.
Version:7.0.3
Build:20845200
I do this manually every time.
I switch the 2nd host on only when I install updates. All the other time the host is off to save power.
And perhaps in a 2-node cluster that doesn't work, because VMware HA needs the failover resources?
I see... I was trying it with 2 nodes to get it to put all the VMs on one server and shut the other off to save power. The stuff I read seems to say I need a minimum of 2 servers. I will try to get a third server added and see then. There is a DRS cluster feature where I can tell DRS to keep all the VMs together on one server, so if a migration is going to happen, all would move at the same time. So all the time I have the VMs on one server, and I thought the other server would go into standby, having no VMs running on it. Will you ever test it this way?
As I wrote: do you have VMware HA enabled? If yes, disable it for a test, because a 2-node HA cluster doesn't have enough resources in case of a failure once you shut down one host.
It worked when I turned off HA on the cluster... so with 2 nodes, DPM does work automatically by turning off the idle server. I will get a 3rd server and see if it works alongside HA. Thanks.