We are testing a cluster of three ESXi 7.0 U2 hosts using a standard switch and iSCSI storage. The vmk ports for iSCSI are also tagged for vMotion. This has been a standard in our environments with few issues.
vMotion was not working on this small cluster, so we verified that the load balancing type and NIC failover settings match across all three hosts. Everything matched as it should. We kept getting vMotion failures with the error 'destination not receiving data from source', as if the hosts were not communicating. I logged into the hosts, and vmkping responses to and from all hosts are good. Scratching head...
So we reconfigured by removing vMotion from the iSCSI vmk adapter and adding vMotion to vmk3, which was to be used for replication. Restarted all 3 hosts and still no dice. Hmm... so, for grins and giggles, we removed the vmk3 port from the hosts, restarted services, then added vMotion back to the iSCSI vmk adapter, and it's now working on 2 of the 3 hosts.
Not sure why the 3rd host is not working. I've remoted into the problem host and ran esxcli network ip interface list, and it only shows vmk0 through vmk2... Is there a ghost of vmk3 somewhere in the system that is borking vMotion and I'm just not finding it?
Thanks
Just guessing... did you check your MTU at the vSwitch and VMK level?
Try
vmkping -d -s 8972 <OtherIP> -I vmkxx
and
vmkping <OtherIP> -I vmkxx
If only the second one works, then you have an MTU problem, i.e. the jumbo MTU is missing somewhere along the path.
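For anyone wondering where the 8972 comes from: it is the largest ping payload that still fits into a single 9000-byte frame once the IPv4 header (20 bytes, assuming no IP options) and the ICMP echo header (8 bytes) are subtracted. A small sketch of that arithmetic, where the function name max_ping_payload is just a made-up helper:

```shell
# Hypothetical helper: largest ICMP payload that fits in one unfragmented
# IPv4 packet for a given MTU.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte ICMP echo header.
max_ping_payload() {
  mtu=$1
  echo $((mtu - 20 - 8))
}

max_ping_payload 9000   # jumbo frames
max_ping_payload 1500   # standard MTU
```

With -d (do not fragment) set, a payload larger than this value cannot pass a hop whose MTU is smaller than expected, which is why the first vmkping is the one that fails when jumbo frames are not configured end to end.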
Regards,
Joerg
Hi Joerg,
Both methods return a positive response, and I verified the MTU is set to jumbo frames (9000) to match the MTU of the other 2 hosts.
Thank you!
Clay
Hmm.
We are an iSCSI shop as well, but I have never configured vMotion on an iSCSI VMK. Sometimes we have vMotion enabled on the management VMK, but most of the time it is a dedicated VMK (VLAN ID, jumbo frames, ...). They often share the same VMNICs, though, because we only have 2x25G.
Is your iSCSI VMK one that is bound (iSCSI port binding)?
Regards,
Joerg
Yes, they are bound on that vmk. While I do prefer to have a dedicated vmk just for vMotion, we currently do not have enough NICs, so they are being shared.
That being said, I found the reason why it did not work, and here is what I did as a workaround to get this resolved.
Configuration as shown in vSphere:
iSCSI-1
vmk1 on vmnic4 active, vmnic5 unused
iSCSI-2
vmk2 on vmnic5 active, vmnic4 unused
vmnic4 =
IP address 172.x.x.100
vmnic5 =
IP address 172.x.x.200
All hosts should respond to vmkping per their respective vmk ports:
vmk1 should correctly respond as x.100, which correlates to vmnic4
vmk2 should correctly respond as x.200, which correlates to vmnic5
Both Host A and Host C respond as configured, and vMotion works between them.
Host B response:
vmk1 comes back as x.200 vmnic4.
vmk2 comes back as x.100 vmnic5.
The quick fix for me:
On Host B,
change vmk1 to use vmnic5 active, vmnic4 unused
change vmk2 to use vmnic4 active, vmnic5 unused
and now vMotion is working between all three hosts...
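In case it helps someone chasing the same symptom, here is a rough sketch of the comparison I did by hand. It assumes you have already collected each host's vmk addresses yourself (for example from the vmkping results) and written them as "host vmk ip" lines; the function name find_mismatches, the host labels, and that input format are all made up for illustration, and the x.100 / x.200 values are placeholders just like above.

```shell
# Hypothetical helper: reads "host vmk ip" lines on stdin and reports every
# vmk whose address differs from the expected mapping.
# $1 = expected mapping as space-separated "vmk=ip" pairs.
find_mismatches() {
  awk -v expected="$1" '
    BEGIN {
      # Build a lookup table: want["vmk1"] = "x.100", and so on.
      n = split(expected, pairs, " ")
      for (i = 1; i <= n; i++) {
        split(pairs[i], kv, "=")
        want[kv[1]] = kv[2]
      }
    }
    # Print only rows where a known vmk carries an unexpected address.
    ($2 in want) && $3 != want[$2] {
      print $1 ": " $2 " has " $3 ", expected " want[$2]
    }
  '
}

# Observations as described in this thread (placeholder addresses).
printf '%s\n' \
  'HostA vmk1 x.100' 'HostA vmk2 x.200' \
  'HostB vmk1 x.200' 'HostB vmk2 x.100' \
  'HostC vmk1 x.100' 'HostC vmk2 x.200' |
  find_mismatches 'vmk1=x.100 vmk2=x.200'
```

Only the Host B rows are reported, which matches what I saw: the addresses on vmk1 and vmk2 were swapped on that one host.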
Clay