VMware Cloud Community
baber
Expert

vSAN 2 node cluster and partition problem

Hi

I am running a 2-node vSAN cluster in my lab.

My environment is as follows:
vCenter = 7.0.2 18356314                 hostname: vcsa.lab.vm
esxi1   = VMware ESXi, 7.0.2, 18426014   hostname: lhost1.lab.vm
esxi2   = VMware ESXi, 7.0.2, 18426014   hostname: lhost2.lab.vm
Witness = VMware ESXi, 7.0.2, 18426014   hostname: witness.lab.vm

I am seeing strange health-check failures (screenshots attached): vSAN cluster partition, vSAN: Basic (unicast) connectivity check, and vSAN: MTU check (ping with large packet size).

I did the following troubleshooting:

[root@lhost1:~] esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2021-10-09T06:20:56Z
Local Node UUID: 61509580-fbad-f512-bff5-005056ae7d16
Local Node Type: NORMAL
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 61509580-fbad-f512-bff5-005056ae7d16
Sub-Cluster Backup UUID: 61509659-bf16-cb81-6819-005056ae0f12
Sub-Cluster UUID: 523398d1-3e4b-1404-c339-d534ab9cfde7
Sub-Cluster Membership Entry Revision: 1
Sub-Cluster Member Count: 2
Sub-Cluster Member UUIDs: 61509580-fbad-f512-bff5-005056ae7d16, 61509659-bf16-cb81-6819-005056ae0f12
Sub-Cluster Member HostNames: lhost1, lhost2
Sub-Cluster Membership UUID: ec2a6161-fa6c-5d67-2a02-005056ae7d16
Unicast Mode Enabled: true
Maintenance Mode State: OFF
Config Generation: f8115b35-cdc4-4e93-ab41-6f436b8b4773 3 2021-10-09T05:44:52.71
Mode: REGULAR

As the output shows, both data nodes are listed as members.
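
For anyone who wants to script this check, here is a minimal sketch that parses the `esxcli vsan cluster get` text shown above and compares the member count against an expected value. The helper names and the `expected_members` value are assumptions for illustration; the value 3 assumes that the witness is counted as a cluster member alongside the two data nodes, as is normally the case in a 2-node cluster.

```python
# Minimal sketch: parse "esxcli vsan cluster get" output (as plain text) and
# compare the reported member count with an expected value.
# Function names and expected_members are illustrative assumptions.

def parse_cluster_get(text):
    """Return a dict of 'Key: value' pairs from the command output."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def membership_report(info, expected_members):
    """Summarise cluster membership and flag a short member list."""
    count = int(info["Sub-Cluster Member Count"])
    hosts = [h.strip() for h in info["Sub-Cluster Member HostNames"].split(",")]
    return {"count": count, "hosts": hosts, "complete": count == expected_members}


# Excerpt of the output posted above.
SAMPLE = """\
Local Node State: MASTER
Sub-Cluster Member Count: 2
Sub-Cluster Member HostNames: lhost1, lhost2
"""

# Assumption: 2 data nodes + 1 witness = 3 expected members.
report = membership_report(parse_cluster_get(SAMPLE), expected_members=3)
print(report)
```

With the excerpt above, `complete` comes back `False`: only lhost1 and lhost2 are listed and the witness is absent, which would be consistent with the partition warning.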

lhost1.lab.vm has two VMkernel adapters: vmk0 (10.20.30.3) for management, vMotion and witness traffic; vmk1 (10.20.40.2) for vSAN traffic
lhost2.lab.vm has two VMkernel adapters: vmk0 (10.20.30.4) for management, vMotion and witness traffic; vmk1 (10.20.40.3) for vSAN traffic
witness.lab.vm has two VMkernel adapters: vmk0 (10.20.30.11) for management; vmk1 (10.20.40.4) for vSAN traffic


I can also ping all the other vSAN VMkernel adapters from each vSAN VMkernel adapter.

From lhost1:

lhost1.jpg

From lhost2:

lhost2.jpg

From the witness host:

witness.jpg

Also, as this is a nested environment, all MTUs are set to 1500.

As you can see, all connections are OK, so why does it report a cluster partition and unicast connectivity failures? From what I have read, this issue existed in previous versions of vSAN and was fixed in vSAN 7.

Please mark helpful or correct if my answer resolved your issue.
2 Replies
TheBobkin
Champion

@baber As ICMP ping looks to be working, I'd advise validating that the data nodes and the Witness can communicate on the port required for cluster membership (UDP 12321). Confirm that you see packets leaving both data nodes (they currently hold the Master and Backup roles) and reaching the Witness:

On the data-nodes:

# tcpdump-uw -i vmk0 -n -s0 -t udp port 12321

On the Witness:

# tcpdump-uw -i vmk1 -n -s0 -t udp port 12321

If you see packets leaving the Master and Backup but not reaching the Witness (or arriving from only one IP, not both), then something in between is blocking the traffic (e.g. a firewall, proxy, or network-optimisation device).
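
If the capture on the Witness is saved to a text file, the "only from one IP, not both" case can be checked with a short script. This is a rough sketch only: the function names are made up for illustration, the sample lines are invented rather than taken from this cluster, and the line format is assumed to be what tcpdump prints for IPv4 with `-n -t`.

```python
import re

# Assumed tcpdump line shape with -n -t (no name resolution, no timestamps):
#   IP <src-ip>.<src-port> > <dst-ip>.<dst-port>: UDP, length N
LINE_RE = re.compile(
    r"^IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def senders_to_port(lines, port=12321):
    """Return the set of source IPs seen sending to the given UDP port."""
    seen = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group(4)) == port:
            seen.add(match.group(1))
    return seen


def missing_senders(lines, expected_ips, port=12321):
    """Return the expected source IPs that never appear in the capture."""
    return set(expected_ips) - senders_to_port(lines, port)


# Invented sample capture from the Witness: only one data node shows up.
SAMPLE_LINES = [
    "IP 10.20.30.3.12321 > 10.20.40.4.12321: UDP, length 200",
    "IP 10.20.30.3.12321 > 10.20.40.4.12321: UDP, length 200",
]

# The vmk0 addresses of the two data nodes, as given in the original post.
print(missing_senders(SAMPLE_LINES, {"10.20.30.3", "10.20.30.4"}))
```

Any IP printed here is a data node whose membership traffic never reached the Witness in that capture.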

TheBobkin
Champion

Hold on, I had a second look at your screenshots: you are testing from vmk1 on the data nodes to the Witness, but that isn't the interface you said is configured for witness traffic. Validate that you can reach the vSAN-enabled Witness vmk IP via vmk0 on the data nodes. If you can't, then that is the problem, and you should either fix that path, OR configure vmk0 on the Witness for vSAN traffic, OR just remove the WTS (Witness Traffic Separation) configuration and have all vSAN networking use only vmk1 on all nodes.
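
To illustrate the mismatch, here is a small sketch that takes the addresses from the original post and works out which data-node adapter shares a subnet with the Witness vSAN IP. The /24 prefix is an assumption (no netmasks were posted), and the function and variable names are made up for the example.

```python
import ipaddress

# Assumption: all networks are /24; the thread does not state the netmasks.
PREFIX = 24

# Addresses as given in the original post.
NODES = {
    "lhost1": {"vmk0": "10.20.30.3", "vmk1": "10.20.40.2"},
    "lhost2": {"vmk0": "10.20.30.4", "vmk1": "10.20.40.3"},
}
WITNESS_VSAN_IP = "10.20.40.4"   # vmk1 on the Witness
WITNESS_TRAFFIC_VMK = "vmk0"     # adapter tagged for witness traffic on the data nodes


def same_subnet_vmks(vmks, target_ip, prefix=PREFIX):
    """Return the adapters whose subnet contains target_ip."""
    target = ipaddress.ip_address(target_ip)
    return [
        name
        for name, ip in vmks.items()
        if target in ipaddress.ip_interface(f"{ip}/{prefix}").network
    ]


for host, vmks in NODES.items():
    on_link = same_subnet_vmks(vmks, WITNESS_VSAN_IP)
    needs_routing = WITNESS_TRAFFIC_VMK not in on_link
    print(host, "on-link via", on_link, "| witness traffic needs routing:", needs_routing)
```

Under the /24 assumption, only vmk1 on each data node sits on the same subnet as the Witness vSAN IP, so witness traffic leaving vmk0 has to be routed between 10.20.30.x and 10.20.40.x. A successful ping from vmk1 does not prove that path works.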
