Hi all,
We have a cluster that is almost fully utilized, and we just bought new hardware with a newer CPU family.
We want to add the new hosts to the same existing cluster (mixed hardware) because of NSX-V and other vRA limitations. Enabling EVC would solve the issue, but our customers won't accept downtime for all VMs. We thought of using affinity rules to keep VMs separated between the old-CPU and new-CPU hosts. However, I do not know whether we can trust affinity rules for this, and we have a huge number of VMs: almost 8,000 and growing rapidly. What do you think? Is this a bad idea? Do you see any other issues that I might not have thought of?
Note: We are using NSX-V as an underlay network and vRA as a governance and automation portal.
Yes, that is a bad idea. It will only make scheduling extremely complicated for DRS, which will need to take all the rules, and their combinations, into account.
But if you have a cluster with only "old hosts" now, you should be able to set EVC to the CPU generation currently in use. Then, when you add the new hosts, the new instructions will be masked, and you shouldn't need to power off the VMs. I have not done this in ages though, so you may want to test it, as I may be incorrect here 😄
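To make the "set EVC to the current CPUs" idea concrete, here is a minimal sketch of how the common baseline could be picked: it is the newest EVC mode that every host in the cluster can support, which in practice is the mode of the oldest hosts. The host names are hypothetical, the mode list is only an illustrative subset of Intel EVC baselines, and Cascade Lake for the new hosts is an assumption (the thread only says "newer CPU family").

```python
# Illustrative subset of Intel EVC baselines, ordered oldest to newest.
EVC_ORDER = [
    "intel-sandybridge",
    "intel-ivybridge",
    "intel-haswell",
    "intel-broadwell",
    "intel-skylake",
    "intel-cascadelake",
]


def common_evc_mode(host_max_modes):
    """Return the newest EVC mode that every host can support.

    host_max_modes maps a host name to the highest EVC mode that host
    supports. The cluster baseline is limited by the oldest host.
    """
    if not host_max_modes:
        raise ValueError("no hosts given")
    return min(host_max_modes.values(), key=EVC_ORDER.index)


# Hypothetical inventory: two existing Skylake hosts plus one newer host.
hosts = {
    "old-esx-01": "intel-skylake",
    "old-esx-02": "intel-skylake",
    "new-esx-01": "intel-cascadelake",  # assumed CPU family for the new hardware
}

print(common_evc_mode(hosts))
```

With this inventory the result is the Skylake baseline, so the newer host would have its additional instructions masked when it joins the cluster.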
I do agree with you that affinity/anti-affinity rules (AAR) are not a good idea. Regarding EVC, I do not believe you can enable it without powering off all your VMs.
I think the only way to enable EVC without shutting down VMs is to choose an EVC mode lower than the baseline our VMs are running on now.
It's the same for me. In my lab, we have Intel Skylake (Cisco blade servers) and it works just fine. But it does not work when I try this in production, which has the same CPU (Skylake). The only difference in production is that we have rack servers.
Finally, I managed to enable EVC at the cluster level. It turns out that we had a mix of VMs with and without EVC enabled at the VM level. Per-VM EVC needs to be either disabled on all VMs or enabled on all VMs; mixing the two prevents enabling EVC at the cluster level.
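For anyone hitting the same thing with thousands of VMs, a rough sketch of the check I mean is below. It assumes you have already exported, for each VM, whether per-VM EVC is enabled (for example from vCenter with whatever tooling you use); the VM names and the dictionary layout are made up for illustration and this is not a vCenter API call.

```python
def per_vm_evc_report(vms):
    """Group VMs by per-VM EVC state and flag a mixed cluster.

    vms maps a VM name to True (per-VM EVC enabled) or False (disabled).
    """
    enabled = sorted(name for name, on in vms.items() if on)
    disabled = sorted(name for name, on in vms.items() if not on)
    return {
        "enabled": enabled,
        "disabled": disabled,
        # Mixed means at least one VM in each state.
        "mixed": bool(enabled) and bool(disabled),
    }


# Hypothetical export of per-VM EVC state for one cluster.
inventory = {
    "app-vm-001": False,
    "app-vm-002": True,
    "db-vm-001": False,
}

report = per_vm_evc_report(inventory)
if report["mixed"]:
    print("Mixed per-VM EVC; VMs with it enabled:", report["enabled"])
else:
    print("Per-VM EVC is consistent across all VMs")
```

The shorter of the two lists is usually the quickest one to bring in line with the rest.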
Ah, yes, that would be problematic indeed. Good to hear you solved the problem!