VMware Cloud Community
abkaiser
Contributor

ESXi poor performance and high CPU even when system isn't busy

I've got an IBM x3200 server, with:

3 GB RAM

Pentium Dual Core 1.8 GHz

IBM ServeRAID 8s SAS PCIe with 3 15k drives (RAID 5)

Running Windows Server 2003

I recently converted the server installed on this machine to a VM, and the VM's performance is terrible. Anything I do in the VM takes maybe ten times longer than normal, and I get very high CPU utilization even if I'm just opening or closing windows.

I've verified there is a battery backup cache module installed on the RAID card.

I've verified the VM is set to use 1 virtual CPU.

Can anyone assist?

Thanks,

Andy

19 Replies
SuryaVMware
Expert

As I understand from your description, this VM was migrated from a physical box. Correct me if I am wrong.

If it was converted to a VM from a physical machine, you need to check the HAL inside the guest OS and make sure it is set to the ACPI Uniprocessor HAL, not the SMP HAL.

-Surya

abkaiser
Contributor

>This VM was migrated from a physical box. Correct me if I am wrong.

Your assumption is correct.

>If it was converted to a VM from a physical machine, you need to check the HAL inside the guest OS and make sure it is set to the ACPI Uniprocessor HAL, not the SMP HAL.

In Device Manager, it's set to ACPI Multiprocessor. I'm changing it to Uniprocessor now and will see what happens.

abkaiser
Contributor

Update:

Changing to the Uniprocessor HAL did not fix the issue. I notice no change or improvement in performance. Can you think of anything else I can check?

Thanks,

Andy

SuryaVMware
Expert

I am assuming that you restarted the VM after you changed the HAL.

Also, what kind of applications are loaded in the guest OS, and what is the CPU utilization from inside the VM and from outside the VM?

Can you give me a screenshot of Task Manager with the 'Performance' tab selected, and also of the 'Performance' tab in the VI client?

-Surya

joshin
Contributor

Did you remember to install VMware Tools?

-J

abkaiser
Contributor

Yes, VMware Tools is installed.

Andy

abkaiser
Contributor

Surya,

>I am assuming that you restarted the VM after you changed the HAL.

Correct.

>Also, what kind of applications are loaded in the guest OS, and what is the CPU utilization from inside the VM and from outside the VM?

This is Windows Server 2003 SBS. It's running Exchange 2003, and also doing file sharing and print sharing. We have about 15 users. At the time the screen shots were taken, there was NO user activity - I was the only one on the system.

>Can you give me a screenshot of Task Manager with the 'Performance' tab selected, and also of the 'Performance' tab in the VI client?

Attached.

Please let me know what you think.

Thanks!

Andy

abkaiser
Contributor

Let's try attaching those pictures again. Here they are (hopefully).

Andy

abkaiser
Contributor

I get an error message when I try to attach files. I don't know what the problem is. So I've uploaded the files to my personal webpage. For the time being, they're located here:

Hopefully you can grab those. What are your thoughts?

Thanks,

Andy

Jackobli
Virtuoso

What is the purpose of running this in ESXi? You committed (more or less) every byte of your RAM to the guest, so there are no resources available to other guests.

At guest level, in Task Manager, which applications are consuming the most CPU time?

Your disk subsystem is not really that fast. It might not be the cause here, but a RAID 5 built from three disks does not leave many IOPS for ESXi.
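
To put a rough number on that, here is a back-of-the-envelope sketch in Python. It uses the common rule of thumb that each RAID 5 write costs about four back-end I/Os. The per-disk figure of 175 IOPS for a 15k drive and the 50/50 read/write mix are assumptions, not measurements from this server, and the controller's battery-backed cache may absorb part of the write cost.

```python
def raid5_effective_iops(disks, iops_per_disk, write_fraction, write_penalty=4):
    """Estimate front-end IOPS for a RAID 5 set using the write-penalty rule of thumb."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = disks * iops_per_disk
    # Each front-end read costs one back-end I/O; each write costs `write_penalty`.
    cost_per_io = (1.0 - write_fraction) + write_fraction * write_penalty
    return raw_iops / cost_per_io


# Assumed inputs: 3 disks, ~175 IOPS per 15k drive, half of all I/O is writes.
estimate = raid5_effective_iops(disks=3, iops_per_disk=175, write_fraction=0.5)
print(f"Estimated usable IOPS: {estimate:.0f}")
```

With those assumed inputs the estimate comes to about 210 IOPS for the whole array, which the guest and ESXi have to share.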

abkaiser
Contributor

> What is the purpose of running this in ESXi? You committed (more or less) every byte of your RAM to the guest, so there are no resources available to other guests.

There are two reasons for doing this:

1) We had some server problems on the server's original hardware. In order to diagnose without affecting production, we virtualized the environment. One copy of the VM went temporarily on our test hardware for our testing. Another went on new hardware (not the hardware you see now), so we could run hardware diagnostics on the original production hardware. When we were done with our testing and fixes, we moved the production VM back to the production hardware. From day 1, though, we've had these weird performance issues.

2) Easier hardware upgrades in the future: When the client eventually decides to upgrade this server, the migration process will take just the time to move the VM to new hardware (a couple hours), versus 20-30 or more hours for OS install, config and data and services migration.

>At guest level, in Task Manager, which applications are consuming the most CPU time?

It's not any one job. That's the weird part. Anything that's running tends to take up more CPU than normal and to take far longer to run than normal. So even running something simple (like opening Explorer or Device Manager!) pegs the CPU at 100% for 10-20 seconds for what should be a sub-second operation. It's like someone stole a bunch of memory, CPU and disk speed from the system, and I don't know how to get it back.

Andy

CraigAlexander
Contributor

Hi,

What are the specs of the ESXi host server?

Also, you may want to disable any IBM-specific drivers or services that are on the converted VM.

Craig

abkaiser
Contributor

Hi Craig,

>What are the specs of the ESXi host server?

See my first post for this information.

>Also, you may want to disable any IBM-specific drivers or services that are on the converted VM.

I checked the device manager and I see nothing appropriate - no IBM holdovers for RAID, disk, or CPU. However, I'm not very good at interpreting the System Devices section, so here's a screen shot of that - can you see anything I could remove?

http://www.andybrain.com/temp/pic3.bmp

Thanks,

Andy

Jackobli
Virtuoso

>I checked the device manager and I see nothing appropriate - no IBM holdovers for RAID, disk, or CPU. However, I'm not very good at interpreting the System Devices section, so here's a screen shot of that - can you see anything I could remove?

There is a tape drive on the screenshot...

There have been reports that removing "hidden devices" could make ESXi slow. These are devices/drivers that were present on the physical machine and are now absent. Their drivers are still present, but "hidden". Search this community for that; it should turn up some hits.

Are you sure that you left enough memory for ESXi itself? Also, are you sure that there is no stress on the (physical) disk/controller?

Could you do a "regular" / plain install on the same host for a comparison?

abkaiser
Contributor

Jackobli,

The tape drive is okay - that's a tape drive physically installed in the server, connected as a VMware iSCSI device.

>Are you sure that you left enough memory for ESXi itself?

How much should I have? For the memory setup, I used the "recommended" setting. Here are the specs:

ESXi Configuration tab:

Physical Memory:

Total: 3068.4 MB

System: --

Virtual Machines: 2721.0 MB

VM Summary tab:

Resources:

CPU usage: 2128 MHz

Host memory usage: 2.20GB

Guest memory usage: 410MB

(these last two fluctuate slightly)
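
For reference, the gap between the first two figures can be worked out as below. This is only a minimal Python sketch using the numbers listed above. Since the "System" line shows no value, treating the remainder as what is left for ESXi itself is an assumption, and the function names are just illustrative.

```python
# Figures copied from the ESXi Configuration tab listed above.
TOTAL_MB = 3068.4
VM_POOL_MB = 2721.0


def headroom_mb(total_mb, vm_pool_mb):
    """Memory that is not available to virtual machines, in MB."""
    return round(total_mb - vm_pool_mb, 1)


def headroom_percent(total_mb, vm_pool_mb):
    """The same remainder as a percentage of total physical memory."""
    return 100.0 * (total_mb - vm_pool_mb) / total_mb


print(f"Outside the VM pool: {headroom_mb(TOTAL_MB, VM_POOL_MB)} MB "
      f"({headroom_percent(TOTAL_MB, VM_POOL_MB):.1f}% of total)")
```

That works out to roughly 347 MB outside the VM pool.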

VM Settings -> Virtual Machine Properties -> Memory:

http://www.andybrain.com/temp/pic4-memory.bmp

>Also, are you sure that there is no stress on the (physical) disk/controller?

I am fairly sure. There are no attention lights on for the server, and three 15K drives in RAID 5 are probably a little overkill for this environment. With a VM install, I didn't think it would be an issue.

>Could you do a "regular" / plain install on the same host for a comparison?

This is probably not an option for us. The device is in production mode now, and removing it again would be problematic.

Andy

khill
Contributor

As long as you're looking for odd hardware, have you tried checking for hidden devices? Not sure how much effect they might have, but I suppose it couldn't hurt.

To show them, open a command prompt and type in the following:

c:\>set devmgr_show_nonpresent_devices=1

Then launch the device manager from the same command window with:

c:\>devmgmt.msc

When the device manager launches, click "view", then choose the "show hidden devices" option. Anything that was hidden will come up as a slightly greyed-out device in the tree. For my VMs I usually remove anything that won't be needed after a P2V import. I usually end up with multiple network adapters as well as multiple CPUs, all disabled since they no longer exist.

Of course be careful about this as you said it's a production system. I usually do this to a system during the import process but before the production change over to the virtual system.

Maybe something there is causing an issue?

Edit: I just saw Jackobli's comment about removing hidden devices slowing systems down. (I didn't read closely enough the first time). While I haven't had that happen to me, DEFINITELY be careful about it if there are other comments about it causing problems.

abkaiser
Contributor

>I just saw Jackobli's comment about removing hidden devices slowing systems down.

I think I need clarification on this, as the wording confused me slightly.

Jackobli (or anyone in the know), are you saying that hidden devices could slow the system down? Or are you saying REMOVING hidden devices could slow the system down?

Andy

Jackobli
Virtuoso

Sorry Andy! Sometimes I should re-read what I am writing.

There have been postings blaming hidden devices for slowing down the guest.

Removing them might help, or at least eliminate that possibility.

But I agree with the others that there is (as always) a risk in adjusting things in the device manager.

Still, I wouldn't rule out disk I/O or other devices. Perhaps you should try these one after the other, just to narrow it down:

. Reduce memory to, say, 2 GByte (thus making the vswap smaller and leaving more memory for ESXi itself)

. Disconnect any removable devices like floppy, CD/DVD and also the tape

. Have a look at the different performance tabs of ESXi (disk, network...) while doing operations in the guest (like the "open Explorer" you wrote about)

If you are trying to add screenshots, don't use bmp, they are blocked. Use PNG or JPG instead.

abkaiser
Contributor

Still trying out the suggestions, everyone, but wanted to update with an interesting find:

If I boot the server into Safe Mode, the performance is PERFECT. No problems.

I'm now experimenting with MSCONFIG to disable various things and reboot.
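
The disable-and-reboot process can be organized as a halving search rather than going item by item. The Python sketch below only illustrates that idea: the item names are hypothetical placeholders, and `is_slow_with` stands in for the manual step of enabling just those items in MSCONFIG, rebooting, and checking performance. It also assumes exactly one item is responsible.

```python
def find_culprit(items, is_slow_with):
    """Narrow down the one startup item that makes the system slow.

    `is_slow_with(enabled)` must report whether the system is slow when
    only the items in `enabled` are switched on.
    """
    candidates = list(items)
    if not candidates:
        raise ValueError("no startup items to test")
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        if is_slow_with(first_half):
            # The culprit is among the items that were enabled.
            candidates = first_half
        else:
            # The culprit must be in the half that was left disabled.
            candidates = candidates[len(first_half):]
    return candidates[0]


# Simulated run with made-up item names; "item_f" plays the culprit.
startup_items = [f"item_{c}" for c in "abcdefgh"]
culprit = find_culprit(startup_items, lambda enabled: "item_f" in enabled)
print("Culprit:", culprit)
```

With eight items this takes three reboots instead of up to eight. If more than one item contributes to the slowdown, the result will only identify one of them.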

Andy
