VMware Cloud Community
kmcd03
Contributor

Object Format Health Warning after vSphere 7.0U3 Disk Format 15 upgrade

I upgraded our 16-node vSAN stretched cluster from 6.7 to 7.0U3.  After updating the Disk Format I am seeing the warning for vSAN object format health. I did find Cormac Hogan's blog and believe this is the cause of the warning:  https://cormachogan.com/2021/02/09/vsan-7-0u1-object-format-health-warning-after-disk-format-v13-u...

The check is showing 130 objects and 215 TB that need reformat. The problem is the hosts at our primary Fault Domains have <20% free space. We don't have a network overlay, like Geneve or OTV, so our stretched cluster is more active-passive. So the Catch-22 here is that there might not be enough slack space to change the format of objects >255 GB, so we can't upgrade the objects to get the new option that lessens slack space requirements.

This cluster is scheduled to be decommissioned and VMs migrated to new VCF cluster in next 90 days. Is there any harm in ignoring the warning?  

Or does anyone know if there are safeguards to prevent the change object format task from using all the free space?  Will this task check for enough slack space to run?  Is it smart enough that it will only convert a few objects at a time and queue up the other objects?

Thanks.

3 Replies
TheBobkin
Champion

@kmcd03 

"This cluster is scheduled to be decommissioned and VMs migrated to new VCF cluster in next 90 days. Is there any harm in ignoring the warning?"
Nope, no harm. The only consequence of not doing this is that the objects stay on the older format and can't take advantage of the new layout feature, which needs far less slack space during deep reconfiguration of objects (e.g. when changing from Stripe-Width=1 to Stripe-Width=10).

 

"Or does anyone know if there are safeguards to prevent the change object format task from using all the free space?"
Yes, there are safeguards in place: it will pause resync of data to any disk that reaches 95% full (and at that point CLOM would be aggressively trying to move data off that disk to a less-used one); the relayout task can also of course just be cancelled.

 

"Will this task check for enough slack space to run?"
Well, not for the total, but it does a calculation before starting the resync of each object, and if the object can't fit then it won't start that object.

 

"Is it smart enough that it will only convert a few objects at a time and queue up the other objects?"
Yes, it does them a few at a time. The only time this ever really has issues is in situations where there are very few nodes and disproportionately large objects (e.g. a 2-node cluster with 10TB per node and a single object using 6TB per replica).
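
To make the three behaviours above concrete (a per-object fit check, the 95% pause point, and working through objects a few at a time), here is a rough toy model in Python. It is only a sketch and not vSAN's actual CLOM logic: the `Disk` class, `plan_relayout`, the batch size of 2, treating a node as a single pool of capacity, and the assumption that each object needs a temporary copy of its own size while it is converted are all made up for illustration.

```python
from dataclasses import dataclass

# Toy model only: names, batch size, and the "full temporary copy" assumption
# are illustrative and are not taken from vSAN itself.
PAUSE_THRESHOLD = 0.95  # from the reply above: resync to a disk pauses at 95% full


@dataclass
class Disk:
    capacity_tb: float
    used_tb: float

    def fits(self, extra_tb: float) -> bool:
        """True if writing extra_tb keeps the disk below the pause threshold."""
        return (self.used_tb + extra_tb) / self.capacity_tb < PAUSE_THRESHOLD


def plan_relayout(object_sizes_tb, disk, batch_size=2):
    """Split objects into small batches, skipping any that cannot fit.

    Assumes each object needs temporary space equal to its own size and that
    the space is released once its batch has finished.
    """
    batches, skipped, current, in_flight = [], [], [], 0.0
    for size in object_sizes_tb:
        if not disk.fits(size):  # per-object check: never start this one
            skipped.append(size)
            continue
        # Close the current batch when it is full or the next object
        # would push the disk past the pause threshold.
        if len(current) == batch_size or not disk.fits(in_flight + size):
            batches.append(current)
            current, in_flight = [], 0.0
        current.append(size)
        in_flight += size
    if current:
        batches.append(current)
    return batches, skipped


# The 2-node example: 10TB per node, one object using 6TB per replica.
print(plan_relayout([6.0], Disk(capacity_tb=10.0, used_tb=6.0)))
```

Under those assumptions the 6TB object is never started, because a second 6TB copy cannot fit next to the existing one on a 10TB node, while smaller objects on the same node would still be queued and converted in small batches.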

paylkhan
Contributor


@kmcd03 wrote:

I upgraded our 16-node vSAN stretched cluster from 6.7 to 7.0U3.  After updating the Disk Format I am seeing the warning for vSAN object format health. I did find Cormac Hogan's blog and believe this is the cause of the warning:  https://cormachogan.com/2021/02/09/vsan-7-0u1-object-format-health-warning-after-disk-format-v13-u...

The check is showing 130 objects and 215 TB that need reformat. The problem is the hosts at our primary Fault Domains have <20% free space. We don't have a network overlay, like Geneve or OTV, so our stretched cluster is more active-passive. So the Catch-22 here is that there might not be enough slack space to change the format of objects >255 GB, so we can't upgrade the objects to get the new option that lessens slack space requirements.

This cluster is scheduled to be decommissioned and VMs migrated to new VCF cluster in next 90 days. Is there any harm in ignoring the warning?  

Or does anyone know if there are safeguards to prevent the change object format task from using all the free space?  Will this task check for enough slack space to run?  Is it smart enough that it will only convert a few objects at a time and queue up the other objects?

Thanks.


I think all will be fine; I have the same condition and have had no issues to date.

KyleZero
Contributor

Apologies for reviving this old topic, but I am hoping you can elaborate on this-

"the relayout task can also of course just be cancelled."

We are in a situation where the "vSAN object format health" warning is present post upgrade from 6.7 to 7+. There are many large objects that need to be fixed, and unfortunately many of them require high performance (database disks). VMware support is telling me I should start it on a Friday and hope it doesn't have a great impact on production come Monday, when it will undoubtedly still be running. Our clients aren't going to accept "we hope it's done soon". It seems like a major oversight that VMware didn't allow this to be run at scheduled intervals by object, but that is beside the point.

If we have the option via the GUI to stop/cancel the format change task, it's much less of a concern. However, VMware Support has told me there is no option to stop it once you kick it off. I am already skeptical of some of this Support person's responses, so I am wondering if he is just wrong?
