NTP time sync appears to have broken after the 7U3 upgrade. Has anyone else run into this, and do you have suggestions on how I might fix it? I've deleted and re-created the service, checked the firewall policy, tried different servers... Nothing helps.
Original Build: VMware ESXi, 7.0.2, 17867351
New Build: VMware ESXi, 7.0.3, 18644231
I'm a product manager on the vSphere team and have been monitoring this thread for a while. Unfortunately, we have been unable to reproduce internally some of the issues you have reported. Would you be willing to share some of these issues and support bundles from your environment so that we can investigate this further?
This is an excerpt from one of my emails with support:
Although this has been one of the known issues with 7.0 U3, you can definitely apply the workaround steps shown in the article: https://kb.vmware.com/s/article/86255?lang=en_US
That said, I'm not certain my issue is the same as what has been described in some of the above posts.
VMware ESXi, 7.0.3, 19193900
Good afternoon, I can only talk about my personal experience and environment,
With ESXi 7.0U3, and before ESXi 7.0U3c, if for any reason no reliable time source was reachable, the management agent on the hosts crashed in less than a minute. I mitigated that issue by adding a local, reliable time source and referencing it by its IP address rather than by FQDN. It was not a big effort, as my lab is a small one with only six ESXi hosts.
Disabling the monitoring of time-sync-related events also mitigated this issue, but left a permanent warning in HOST > Configuration > time configuration stating "Time service is currently not synchronized", even when the NTP service was working as expected and time sources were available over the internet.
Somehow ESXi 7.0U3c fixed that specific issue but introduced others.
With vCenter 7.0U3c, going to HOST > Configuration > time configuration is more or less futile, as no alarms or warnings are generated or tracked when time synchronization is lost, e.g. when I stopped the service via the command line and restarted it after some time (/etc/init.d/ntpd stop, start, restart). So, to be properly notified of potential issues in a timely manner, I decided to rely on modern logging and monitoring tools.
TBH, I have not seen the NTP service restart by itself, but I'm of the "old school" and so I tend to keep things as simple as possible, and I suppose I was also luckier than others. To be noted: in my lab I don't rely on AD, the hosts, or any kind of virtual machine as a (reliable) time source.
Regards,
Ferdinando
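For anyone who wants to script the command-line check described above instead of relying on the Time Configuration page, here is a minimal sketch. It assumes you have already captured the text output of `ntpq -p` on the host (how you collect it is up to you), and it relies only on the standard ntpq convention that the currently selected sync peer is marked with a leading `*`. The function name and the sample addresses are made up for illustration.

```python
# Hypothetical sample of `ntpq -p` output; the addresses are placeholders.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.168.1.10    .GPS.            1 u   33   64  377    0.412    0.051   0.018
+192.168.1.11    192.168.1.10     2 u   12   64  377    0.530   -0.120   0.040
"""


def selected_peer(ntpq_output):
    """Return (remote, offset_ms) for the peer ntpd has selected, or None.

    ntpq marks the current system peer with a '*' in the first column.
    The offset column is reported in milliseconds.
    """
    for line in ntpq_output.splitlines():
        if line.startswith("*"):
            fields = line[1:].split()
            # Columns: remote refid st t when poll reach delay offset jitter
            if len(fields) >= 9:
                return fields[0], float(fields[8])
    return None


if __name__ == "__main__":
    print(selected_peer(SAMPLE))
```

If the function returns None, ntpd has not selected any peer, which is worth alerting on from whatever monitoring tool you already use.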
Thank you for your feedback. I've shared it with our engineering team.
Any workaround?
VMware has now implemented another partial fix for NTP directly in the new ESXi 7.0 Update 3d.
Short snippet from the 7.0U3d release notes:
In some environments, after upgrading to ESXi 7.0 Update 2d and later, in the vSphere Client you might see the error Host has lost time synchronization. However, the alarm might not indicate an actual issue.
This issue is resolved in this release. [70u3d] The fix replaces the error message with a log function for backtracing and prevents false alarms.
Unfortunately, this does NOT fix the other NTP issues I described earlier in the thread, which still persist, specifically after a clean install on SuperMicro SuperServer mini-ATX systems with Xeon D-series CPUs (we use many in clusters for special-use tasks).
My ESXi hosts are on VMware ESXi, 7.0.3, 19482537 (ESXi 7.0 Update 3d), but I am still seeing this NTP alert on all ESXi hosts.
@sramanuja - Multiple people have now reproduced this: the NTP issues persist and are still unresolved with the latest ESXi 7.0 U3d build.
Please relay that to your engineering team.
I see that too, but we are not seeing such issues being reported in our official support channel. Unless someone raises an SR, uploads logs for us to investigate, and provides me the SR number, I cannot help.
I cannot create SRs because they are supposed to be unique to our customers. Multiple SRs can help raise the visibility, so the best thing you can do is to raise SR and upload logs.
Thanks for replying, and I'm very glad you are able to reproduce the issue!!
Unfortunately, I'm very surprised you are unable to open your own internal SR for the issue you reproduced.
Please private message me a courtesy or internal SR# and method to upload logs to that SR#, and I will gladly do so!
I cannot justify burning the cost of an SR on a known issue that, as you said, will likely get no attention even though you reproduced it, unless multiple customers open multiple SRs to escalate it to engineering for a patch. For such a critical service as NTP, this is not a good answer from VMware.
Regardless, please provide any no-cost SR method so I can assist, and I will.
Thanks!
@sramanuja, good morning,
I beg your pardon, but I have to agree with Labmasterbeta (and many others).
I have now updated vCenter to the current 7.0U3e and ESXi to the current 7.0U3d.
Nonetheless, as far as I have taken the trouble to verify, all the defects in the user interface or the inaccuracy of some of the information displayed are as they were before.
I mean, if I have doubts about the correct functioning of the NTP service I use the command line, exactly as I did before.
Let me tell you that personally I don't understand why user interface elements that have not worked for a long time, like the "famous" action button, are kept.
I am not here to argue (and I do not want to), but should I raise a support request to report that the power consumption shown in the monitoring of a virtual machine cannot amount to "some" thousand kilowatts? Honestly, I don't think so.
From my humble perspective, many no longer bother asking for technical support for long-standing (and maybe obvious) "unresolved product problems and defects"; in the end they manage in another way or ignore them.
Well, this does not mean that I will no longer use VMware products (I don't even think about it) but certainly now I find myself being more cautious and selective than before.
Regards,
Ferdinando
Same here, in a stretched vSAN environment running build 19482537 (DellEMC VxRail image).
I hope there will be a final fix for this.
I am seeing similar issues. My versions are below. Last week one host lost its management interface and reconnected after a while. This is a newly built environment, but we are close to moving to production. Right now we are on 6.5. I am not confident in this version, as we have a new CIO and I don't want to get yelled at LOL!!
I am getting this message, but the time on the hosts is correct most of the time. A few hosts are randomly off, which is scary.
Time service is currently not synchronized.
VMware vCenter Server version: 7.0.3.00600, build number: 19717403
ESXi version: 7.0.3, build number: 19482537
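When the "not synchronized" message shows up but it is unclear whether a clock is really off, one option is to measure the offset directly. The sketch below is a rough SNTP-style query, not a full NTP client: it sends a standard 48-byte client request and reads the server's transmit timestamp, ignoring network delay. It reports the difference between the given server and the clock of whichever machine runs it. The function names are made up, and whether a given server answers (firewall rules, ntpd restrictions) is an assumption you would need to verify in your own environment.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """Return a 48-byte NTP client request (LI=0, VN=3, Mode=3 -> 0x1B)."""
    return b"\x1b" + 47 * b"\0"


def transmit_time(packet):
    """Extract the server transmit timestamp (Unix seconds) from an NTP reply."""
    if len(packet) < 48:
        raise ValueError("NTP packet shorter than 48 bytes")
    # Transmit timestamp: 32-bit seconds + 32-bit fraction at bytes 40..47.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def rough_offset(server, timeout=2.0):
    """Approximate (server clock - local clock) in seconds, ignoring delay."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        data, _ = sock.recvfrom(512)
    return transmit_time(data) - time.time()
```

A call such as `rough_offset("192.0.2.10")` (a placeholder address) returns a small positive or negative number of seconds when the clocks agree; values of a second or more suggest real drift rather than a false alarm.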
VMware ESXi, 7.0.3, 19898904
Same issue on the latest version. Below is the workaround I was given for my case, but the issue comes back after a reboot.
1. In the vSphere Client, go to Configure -> System -> Time Configuration, select "Network Time Protocol" and click the EDIT button
2. In the configuration dialog, uncheck "Enable monitoring events"
3. Click the OK button
sramanuja Can you please check? There should be a PR open.
vSphere 7 = Garbage. 6.7 support needs to be extended another 12 mo. at least.
Regards,
Adam Tyler
TPGOPI007,
If you applied the workaround (unchecking "Enable monitoring events"), will no events be logged at all?
Not logged until the next reboot
Thanks!
VMware has released a new ESXi version.
ESXi 7.0 Update 3e | ESXi_7.0.3-0.40.19898904 | 2022-06-14 | 19898904
Has anyone checked whether the issue is finally solved?
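Since builds are quoted in several different forms in this thread, here is a small lookup sketch for matching a build string to the release it was reported as. The table contains only the build numbers mentioned in this thread, labeled the way the posters labeled them; it is not an official or complete list, and the names are made up for illustration.

```python
import re

# Build numbers exactly as reported in this thread (not an official list).
BUILDS_FROM_THREAD = {
    "17867351": "7.0.2",
    "18644231": "7.0.3",
    "19193900": "7.0.3",  # update level not stated in the thread
    "19482537": "7.0.3 (Update 3d)",
    "19898904": "7.0.3 (Update 3e)",
}


def describe_build(text):
    """Find an 8-digit build number in text and return its label, or None."""
    match = re.search(r"\b(\d{8})\b", text)
    if match is None:
        return None
    return BUILDS_FROM_THREAD.get(match.group(1))


if __name__ == "__main__":
    print(describe_build("VMware ESXi, 7.0.3, 19482537"))
```

Unknown builds return None, so anything newer than the posts above simply falls through rather than being mislabeled.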