I am midway through upgrading to ESX 3 at the moment, so I have some machines on 3.0.2 and some on 2.1.3. All ESX servers are set up to sync via NTP, and all show the correct time using either date or hwclock --show.
All VMs sync via vmware-tools. Windows time service is disabled. We have a mix of W2k3, W2k, and NT4 VMs.
The VMs still on the ESX 2 hosts all have correct time. The VMs on ESX 3 are all 3-4 minutes fast.
It seems as if the new vmware-tools time sync cannot correct a clock that is running fast. If I manually set the time a few minutes slow, then go into vmware-tools and re-tick the time sync option, the time is corrected. The time is not corrected if I manually set it fast.
I see a number of time-sync issues on this forum, but none seem to match this. Is anyone else having this issue and is there a fix?
Ok, but how can I prevent the VMs from gaining time?
I'm not running any Linux guest and Windows Time service is disabled. These guests were all running with accurate clocks on ESX 2 before they were migrated.
I am wondering whether the migration process could have caused the VMs' clocks to be fast, especially as they're all fast by about the same amount. I have just built a new box on one of the ESX 3 hosts, so I'll monitor that and see if the clock stays true.
The VM I built yesterday is already about 5 seconds fast. I also set a few other VMs back to the correct time and they too are now 5 seconds fast.
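To put a number on that drift, here is a minimal sketch that converts "seconds gained over some interval" into parts per million. The function name drift_ppm is made up for illustration, and the 24-hour interval is an assumption: "yesterday" is only approximate and the real elapsed time was not measured.

```python
def drift_ppm(gained_seconds: float, elapsed_seconds: float) -> float:
    """Clock drift in parts per million (seconds gained per million seconds elapsed)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return gained_seconds / elapsed_seconds * 1_000_000


# Assumption: "built yesterday" is treated as roughly 24 hours of uptime.
ASSUMED_ELAPSED_SECONDS = 24 * 60 * 60

rate = drift_ppm(5, ASSUMED_ELAPSED_SECONDS)
print(f"approx. drift: {rate:.1f} ppm")
```

Under that assumption, 5 seconds in a day works out to roughly 58 ppm. Plugging in an accurately measured interval would give a figure that can be compared between hosts.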
I enabled the Timetracker Stats and here are a few lines from the log:
Nov 30 10:30:29.067: vmx| TimeTrackerStats behind by 58569667 cycles (26657 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:30:29.068: vmx| TimeTrackerStats CMOS-P 3840 ints, 64.00/sec, 64.00 avg, 64.00 req; 10917925 tot, 10917293 req; 1219 loprg, 5667 rtry
Nov 30 10:30:29.068: vmx| TimeTrackerStats timer0 1092 ints, 18.20/sec, 18.21 avg, 18.21 req; 3105896 tot, 3105716 req; 1 loprg, 2 rtry
Nov 30 10:30:29.068: vmx| TimeTrackerStats PIIX4PMTT 25 ints, 0.42/sec, 0.43 avg, 0.43 req; 72794 tot, 72790 req; 0 loprg, 0 rtry
Nov 30 10:31:29.068: vmx| TimeTrackerStats behind by -1799869 cycles (-819 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:31:29.068: vmx| TimeTrackerStats CMOS-P 3842 ints, 64.03/sec, 64.00 avg, 64.00 req; 10921767 tot, 10921133 req; 1220 loprg, 5673 rtry
Nov 30 10:31:29.068: vmx| TimeTrackerStats timer0 1093 ints, 18.22/sec, 18.21 avg, 18.21 req; 3106989 tot, 3106808 req; 1 loprg, 2 rtry
Nov 30 10:31:29.068: vmx| TimeTrackerStats PIIX4PMTT 26 ints, 0.43/sec, 0.43 avg, 0.43 req; 72820 tot, 72815 req; 0 loprg, 0 rtry
Nov 30 10:32:29.068: vmx| TimeTrackerStats behind by 27398498 cycles (12470 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:32:29.068: vmx| TimeTrackerStats CMOS-P 3839 ints, 63.98/sec, 64.00 avg, 64.00 req; 10925606 tot, 10924973 req; 1220 loprg, 5673 rtry
Nov 30 10:32:29.069: vmx| TimeTrackerStats timer0 1092 ints, 18.20/sec, 18.21 avg, 18.21 req; 3108081 tot, 3107900 req; 1 loprg, 2 rtry
Nov 30 10:32:29.069: vmx| TimeTrackerStats PIIX4PMTT 26 ints, 0.43/sec, 0.43 avg, 0.43 req; 72846 tot, 72841 req; 0 loprg, 0 rtry
Nov 30 10:33:29.069: vmx| TimeTrackerStats behind by 52577373 cycles (23930 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:33:29.069: vmx| TimeTrackerStats CMOS-P 3840 ints, 64.00/sec, 64.00 avg, 64.00 req; 10929446 tot, 10928813 req; 1221 loprg, 5676 rtry
Nov 30 10:33:29.069: vmx| TimeTrackerStats timer0 1092 ints, 18.20/sec, 18.21 avg, 18.21 req; 3109173 tot, 3108993 req; 1 loprg, 2 rtry
Nov 30 10:33:29.070: vmx| TimeTrackerStats PIIX4PMTT 25 ints, 0.42/sec, 0.43 avg, 0.43 req; 72871 tot, 72867 req; 0 loprg, 0 rtry
Nov 30 10:34:29.070: vmx| TimeTrackerStats behind by 8343928 cycles (3797 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:34:29.071: vmx| TimeTrackerStats CMOS-P 3841 ints, 64.02/sec, 64.00 avg, 64.00 req; 10933287 tot, 10932653 req; 1221 loprg, 5676 rtry
Nov 30 10:34:29.071: vmx| TimeTrackerStats timer0 1093 ints, 18.22/sec, 18.21 avg, 18.21 req; 3110266 tot, 3110085 req; 1 loprg, 2 rtry
Nov 30 10:34:29.071: vmx| TimeTrackerStats PIIX4PMTT 26 ints, 0.43/sec, 0.43 avg, 0.43 req; 72897 tot, 72892 req; 0 loprg, 0 rtry
Nov 30 10:35:29.071: vmx| TimeTrackerStats behind by 42059702 cycles (19143 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:35:29.071: vmx| TimeTrackerStats CMOS-P 3840 ints, 64.00/sec, 64.00 avg, 64.00 req; 10937127 tot, 10936493 req; 1221 loprg, 5676 rtry
Nov 30 10:35:29.071: vmx| TimeTrackerStats timer0 1092 ints, 18.20/sec, 18.21 avg, 18.21 req; 3111358 tot, 3111178 req; 1 loprg, 2 rtry
Nov 30 10:35:29.071: vmx| TimeTrackerStats PIIX4PMTT 26 ints, 0.43/sec, 0.43 avg, 0.43 req; 72923 tot, 72918 req; 0 loprg, 0 rtry
Nov 30 10:36:29.071: vmx| TimeTrackerStats behind by 73308682 cycles (33365 us); running at 100%; 0 stops, 0 giveups
Nov 30 10:36:29.072: vmx| TimeTrackerStats CMOS-P 3839 ints, 63.98/sec, 64.00 avg, 64.00 req; 10940966 tot, 10940333 req; 1221 loprg, 5676 rtry
Nov 30 10:36:29.072: vmx| TimeTrackerStats timer0 1093 ints, 18.22/sec, 18.21 avg, 18.21 req; 3112451 tot, 3112270 req; 1 loprg, 2 rtry
Nov 30 10:36:29.072: vmx| TimeTrackerStats PIIX4PMTT 25 ints, 0.42/sec, 0.43 avg, 0.43 req; 72948 tot, 72943 req; 0 loprg, 0 rtry
As you can see, the VM thinks it's behind, even though it is nearly a minute fast.
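For anyone who wants to dig through a longer log, here is a rough sketch that pulls out the "behind by" figure and, for each timer device, the difference between the "tot" and "req" interrupt counts converted to seconds at the requested rate. It assumes the line format shown above, and it assumes "tot" means interrupts delivered and "req" means interrupts requested; that reading is a guess, not something confirmed by VMware documentation. The names summarize and SAMPLE are made up for illustration.

```python
import re

BEHIND = re.compile(r"TimeTrackerStats behind by (-?\d+) cycles \((-?\d+) us\)")
DEVICE = re.compile(
    r"TimeTrackerStats (\S+)\s+\d+ ints, [\d.]+/sec, [\d.]+ avg, "
    r"([\d.]+) req; (\d+) tot, (\d+) req"
)


def summarize(lines):
    """Return (behind_us, surplus) parsed from TimeTrackerStats log lines.

    behind_us: list of the "behind by" values in microseconds, in log order.
    surplus:   device name -> (tot - req) / requested rate, in seconds,
               taken from the most recent line seen for that device.
    """
    behind_us = []
    surplus = {}
    for line in lines:
        m = BEHIND.search(line)
        if m:
            behind_us.append(int(m.group(2)))
            continue
        m = DEVICE.search(line)
        if m:
            name = m.group(1)
            rate = float(m.group(2))
            tot, req = int(m.group(3)), int(m.group(4))
            # Assumption: tot = interrupts delivered, req = interrupts requested.
            if rate > 0:
                surplus[name] = (tot - req) / rate
    return behind_us, surplus


# First sample from the log above, copied verbatim.
SAMPLE = [
    "Nov 30 10:30:29.067: vmx| TimeTrackerStats behind by 58569667 cycles (26657 us); running at 100%; 0 stops, 0 giveups",
    "Nov 30 10:30:29.068: vmx| TimeTrackerStats CMOS-P 3840 ints, 64.00/sec, 64.00 avg, 64.00 req; 10917925 tot, 10917293 req; 1219 loprg, 5667 rtry",
    "Nov 30 10:30:29.068: vmx| TimeTrackerStats timer0 1092 ints, 18.20/sec, 18.21 avg, 18.21 req; 3105896 tot, 3105716 req; 1 loprg, 2 rtry",
    "Nov 30 10:30:29.068: vmx| TimeTrackerStats PIIX4PMTT 25 ints, 0.42/sec, 0.43 avg, 0.43 req; 72794 tot, 72790 req; 0 loprg, 0 rtry",
]

behind_us, surplus = summarize(SAMPLE)
print("behind (us):", behind_us)
for name, secs in surplus.items():
    print(f"{name}: {secs:+.2f} s of interrupts beyond requested")
```

On the first sample, "behind by" is only about 27 ms, while CMOS-P and timer0 both come out at roughly 9.9 seconds' worth of interrupts beyond what was requested. If the tot/req interpretation is right, that would be consistent with the guest clock running fast, but it is only a hypothesis.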
Maybe a silly question, but did you check the time on the ESX host? The VMware tools compare the VM time to the ESX host time, so if the ESX host clock is fast you can get the effect you described.
Yes, ESX hosts are showing correct time, both with date and with hwclock --show.
I have just found this patch, which may fix the issue. I'm going to try it on one of the hosts.
The patch made no difference. VMs gained 15 seconds over the weekend.