VMware Cloud Community
nflnetwork29
Contributor

CPU alarm keeps triggering, need help

I keep getting the alert below.

My question is: how do I find the source of the problem?

Target: 192.168.123.61

Previous Status: Yellow

New Status: Red

Alarm Definition:

([Yellow metric Is above 75%; Red metric Is above 90%])

Current values for metric/state:

Metric Usage = 100%

Description:

Alarm 'Host cpu usage' on 192.168.123.61 changed from Yellow to Red
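
For readers unfamiliar with how the thresholds in this alarm definition map to a status, here is a minimal Python sketch. The function name and the "Green" label for the normal state are assumptions for illustration; only the 75% and 90% thresholds come from the alert itself.

```python
def host_cpu_status(usage_percent, yellow_above=75.0, red_above=90.0):
    """Map a CPU usage percentage to an alarm status.

    Thresholds are taken from the alarm definition quoted above;
    "Green" is an assumed label for "no threshold exceeded".
    """
    if usage_percent > red_above:
        return "Red"
    if usage_percent > yellow_above:
        return "Yellow"
    return "Green"


print(host_cpu_status(100))  # Red
```

With the reported usage of 100%, the status is Red, which matches the Yellow to Red transition in the description.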

6 Replies
jrmunday
Commander

Since this is a host alarm, it means that either the combined CPU workload from all of your allocated vCPUs is more than the host can handle, or one or more individual VMs are driving it. Is this a constant alarm, or is it only triggered at a specific time?

I would look at the CPU utilisation of all the VMs on this host to see where the workload is coming from. If you can't single out one or more specific VMs, then you probably have too many VMs on this host and need to move some off. What infrastructure do you have, so we can identify some possible options?

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
nflnetwork29
Contributor

Hi Jon,

This host is running 26 View desktops and 4 templates (powered off).

It has 24 logical processors.

The alarm is triggered every Monday at the same time.

What else do you suggest in order to track down the culprit? Maybe it can be safely ignored?

rachelsg
Enthusiast

Hi,

How many VMs are running on the host?

The best way is to implement monitoring software.

homerzzz
Hot Shot

You can run esxtop on the host during the Monday time period and see whether CPU usage spikes for any VMs or processes.
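
If nobody is around to watch the screen at that hour, esxtop can also be left running in batch mode, for example `esxtop -b -d 10 -n 360 > esxtop.csv` (the interval and sample count here are arbitrary), and the CSV reviewed afterwards. Below is a rough Python sketch for pulling the peak CPU value per group out of such a file. It assumes the column headers look like `\\host\Group Cpu(<id>:<name>)\% Used`; check the header row of your own capture, since the exact naming is an assumption here, and the function names are purely illustrative.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed header shape: \\host\Group Cpu(<id>:<name>)\% Used
COLUMN_RE = re.compile(r"\\Group Cpu\(\d+:(?P<name>[^)]+)\)\\% Used$")


def peak_cpu_by_group(csv_text):
    """Return {group name: peak '% Used'} from esxtop batch-mode CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Remember which column indexes hold a per-group "% Used" counter.
    columns = {}
    for index, title in enumerate(header):
        match = COLUMN_RE.search(title)
        if match:
            columns[index] = match.group("name")

    peaks = defaultdict(float)
    for row in reader:
        for index, name in columns.items():
            try:
                value = float(row[index])
            except (IndexError, ValueError):
                continue  # skip short rows and non-numeric cells
            peaks[name] = max(peaks[name], value)
    return dict(peaks)


def top_consumers(csv_text, count=5):
    """Return the `count` groups with the highest peak, highest first."""
    peaks = peak_cpu_by_group(csv_text)
    return sorted(peaks.items(), key=lambda item: item[1], reverse=True)[:count]
```

Calling `top_consumers(open("esxtop.csv").read())` would then list the groups that peaked highest during the capture window.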

jrmunday
Commander

Is there anything specific that executes at this specific time on Monday, perhaps an anti-virus definition update or some other scheduled task?

I would just set up some virtual machine CPU alarms that trigger an email alert when metrics go above your defined thresholds. If you get alerted from all VMs, then you know it's not a specific one that's causing it. Depending on the time it happens, if it's not too unsociable, then simply open your performance charts and look to see what's happening at the time. What do the stats look like leading up to the time that this happens? What changed?

As suggested above, esxtop can also help you out, but you may find the performance charts easier on the eye.
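
The "all VMs versus a specific one" reasoning can be sketched in a few lines of Python. This assumes you already have a per-VM CPU usage percentage for the time of the alarm (from the alert emails or the performance charts); the function name, the sample data and the 90% default threshold are illustrative assumptions, not values from vCenter.

```python
def classify_cpu_load(vm_usage_percent, threshold=90.0):
    """Label the load pattern from a {vm name: CPU usage %} mapping.

    Returns a (label, hot_vms) tuple, where hot_vms is the sorted list
    of VM names whose usage is above the threshold.
    """
    if not vm_usage_percent:
        return "no data", []
    hot = sorted(
        name for name, usage in vm_usage_percent.items() if usage > threshold
    )
    if not hot:
        return "no VM above threshold", hot
    if len(hot) == len(vm_usage_percent):
        return "all VMs above threshold", hot
    return "specific VMs above threshold", hot


# Hypothetical readings for three desktops at the time of the alarm.
label, hot = classify_cpu_load({"desk-01": 97.0, "desk-02": 12.0, "desk-03": 95.0})
print(label, hot)  # specific VMs above threshold ['desk-01', 'desk-03']
```

If every VM lands in the hot list, the cause is more likely something that runs everywhere at once, such as a scheduled task, than a single misbehaving desktop.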

Out of interest, what time does this occur? Is there anything significant about this time?

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
nflnetwork29
Contributor

Hi, it's every Tuesday at 11 PM.

There are also several virtual machines (Win XP) that get high CPU alerts, as well as the following device error:

Device naa.60080e50002dc2cc000001bc4fc767c2 performance has deteriorated. I/O latency increased from average value of 1448 microseconds to 29004 microseconds.

warning

08/04/2014 11:52:49 PM

192.168.123.61
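
To put the two latency figures from this warning in perspective, here is a small Python calculation. The function name is just for illustration; the two values are the ones quoted in the event.

```python
def latency_increase(baseline_us, current_us):
    """Return (increase factor, current latency in milliseconds)."""
    factor = current_us / baseline_us
    current_ms = current_us / 1000.0
    return factor, current_ms


factor, current_ms = latency_increase(1448, 29004)
print(f"{factor:.1f}x baseline, {current_ms:.1f} ms")  # 20.0x baseline, 29.0 ms
```

So the device went from about 1.4 ms to about 29 ms, roughly a 20-fold increase, at around the same time as the CPU alarms.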
