Hi
I am using a 2-node vSAN cluster in my lab.
My environment is as follows:
vCenter = 7.0.2 18356314 hostname : vcsa.lab.vm
esxi1 = VMware ESXi, 7.0.2, 18426014 hostname : lhost1.lab.vm
esxi2 = VMware ESXi, 7.0.2, 18426014 hostname : lhost2.lab.vm
Witness = VMware ESXi, 7.0.2, 18426014 hostname : witness.lab.vm
I have run into strange issues, shown in the attached screenshots: vSAN cluster partition, vSAN: Basic (unicast) connectivity check, and vSAN: MTU check (ping with large packet size).
I did the following troubleshooting:
[root@lhost1:~] esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2021-10-09T06:20:56Z
Local Node UUID: 61509580-fbad-f512-bff5-005056ae7d16
Local Node Type: NORMAL
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 61509580-fbad-f512-bff5-005056ae7d16
Sub-Cluster Backup UUID: 61509659-bf16-cb81-6819-005056ae0f12
Sub-Cluster UUID: 523398d1-3e4b-1404-c339-d534ab9cfde7
Sub-Cluster Membership Entry Revision: 1
Sub-Cluster Member Count: 2
Sub-Cluster Member UUIDs: 61509580-fbad-f512-bff5-005056ae7d16, 61509659-bf16-cb81-6819-005056ae0f12
Sub-Cluster Member HostNames: lhost1, lhost2
Sub-Cluster Membership UUID: ec2a6161-fa6c-5d67-2a02-005056ae7d16
Unicast Mode Enabled: true
Maintenance Mode State: OFF
Config Generation: f8115b35-cdc4-4e93-ab41-6f436b8b4773 3 2021-10-09T05:44:52.71
Mode: REGULAR
As the output shows, both members are visible.
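One thing worth checking in that output: in a 2-node cluster the witness host is also a cluster member, so a healthy cluster would normally report a member count of 3 rather than 2. The following is a minimal sketch that reads `esxcli vsan cluster get` output on stdin and compares the reported count with an expected value. The function name `check_member_count` is hypothetical, and the expected value of 3 is an assumption based on two data nodes plus one witness.

```shell
# Hypothetical helper: reads 'esxcli vsan cluster get' output on stdin and
# compares "Sub-Cluster Member Count" with the expected number of members.
check_member_count() {
  expected="$1"
  # Take the text after the colon and keep only the digits.
  count=$(awk -F':' '/Sub-Cluster Member Count/ { gsub(/[^0-9]/, "", $2); print $2; exit }')
  if [ -z "$count" ]; then
    echo "UNKNOWN: no member count found in input"
  elif [ "$count" = "$expected" ]; then
    echo "OK: $count of $expected members present"
  else
    echo "PARTITIONED: $count of $expected members present"
  fi
}
```

On a host this could be used as `esxcli vsan cluster get | check_member_count 3`.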
lhost1.lab.vm has two VMkernel adapters: vmk0 (10.20.30.3) for management, vMotion, and witness traffic, and vmk1 (10.20.40.2) for vSAN traffic.
lhost2.lab.vm has two VMkernel adapters: vmk0 (10.20.30.4) for management, vMotion, and witness traffic, and vmk1 (10.20.40.3) for vSAN traffic.
witness.lab.vm has two VMkernel adapters: vmk0 (10.20.30.11) for management and vmk1 (10.20.40.4) for vSAN traffic.
I can also ping all the other vSAN VMkernel adapters from each vSAN VMkernel adapter:
From lhost1 :
From lhost2
From witness host
Also, since this is a nested environment, all MTUs are 1500.
As you can see, all connections are OK, so why does it report a cluster partition and a unicast connectivity failure? From what I have read, this issue existed in earlier versions of vSAN and was fixed in vSAN 7.
@baber As ICMP ping looks to be working okay, it would be advisable to validate there is communication between the data-nodes and Witness on the port required for cluster membership (UDP 12321) by confirming you see packets leaving both data-nodes (as they are currently Master and Backup role) and reaching the Witness:
On the data-nodes:
# tcpdump-uw -i vmk0 -n -s0 -t udp port 12321
On the Witness:
# tcpdump-uw -i vmk1 -n -s0 -t udp port 12321
If you see packets leaving the Master and Backup but not reaching the Witness (or only from one IP, not both) then you have something blocking the traffic in between them (e.g. Firewall, Proxy, network-optimisation device etc.).
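To compare the two captures quickly, the source IPs can be pulled out of the tcpdump output. This is only a sketch: the function name `udp_sources` is hypothetical, and it assumes the usual `tcpdump -n -t` line format of `IP src.port > dst.port: UDP, length N`.

```shell
# Hypothetical helper: reads tcpdump -n -t output on stdin and prints each
# distinct IPv4 source address once, in order of first appearance.
udp_sources() {
  awk '$1 == "IP" {
    # $2 looks like 10.20.30.3.12321 (address followed by the port)
    n = split($2, p, ".")
    if (n >= 4) {
      ip = p[1] "." p[2] "." p[3] "." p[4]
      if (!seen[ip]++) print ip
    }
  }'
}
```

On the Witness, something like `tcpdump-uw -i vmk1 -n -s0 -t -c 50 udp port 12321 | udp_sources` should then list one address per data-node if both are getting through; only one address (or none) points at traffic being dropped on the way.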
Hold on, I had a second look at your screenshots: you are testing from vmk1 on the data-nodes to the Witness, but this isn't the interface you said is configured for Witness traffic. Validate that you can reach the vsan-enabled Witness vmk IP via vmk0 on the data-nodes. If you can't, then that is the problem, and you should either figure that out, OR configure vmk0 on the Witness for vsan-traffic, OR just remove the WTS (Witness Traffic Separation) configuration and have all vsan-network traffic use only vmk1 on all nodes.
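A rough sketch of that check from a data-node, using the addresses given in this thread (vmk0 as the witness-tagged interface, 10.20.40.4 as the Witness vSAN IP). The `run` wrapper and the `check_witness_path` function are hypothetical helpers; setting `DRY_RUN=1` only prints the commands. The payload size of 1472 assumes a 1500 MTU minus 28 bytes of IP and ICMP headers, with `-d` setting the don't-fragment bit.

```shell
# Hypothetical wrapper: with DRY_RUN=1 the command is printed, not executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: show which vmk carries which vSAN traffic type, then
# ping the Witness vSAN IP from the given source vmk at full 1500-byte MTU.
check_witness_path() {
  src_vmk="$1"
  witness_ip="$2"
  run esxcli vsan network list
  run vmkping -I "$src_vmk" -s 1472 -d -c 3 "$witness_ip"
}
```

On lhost1 or lhost2 this would be called as `check_witness_path vmk0 10.20.40.4`. If the ping fails here while the same ping from vmk1 succeeds, that matches the mismatch described above.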