VMware Cloud Community
HealthIsWhy
Contributor

Up / Down Alerts for Servers and/or virtual machines in general

I am fairly new to VMware overall and very new to vCOps/vROps.  I am also just learning my way around these communities; I did look around but was unable to find anything related to my "issue".  My apologies if it's here somewhere already and I didn't see it.

Has anyone else set up alerts for Up/Down?  Currently, my place of business receives alerts from another product when a server is unreachable.  I have been able to duplicate this within vROps 6.0, but I need to tweak it a bit, and I would like to know whether anyone else has done this and, if so, what they have their wait/cancel cycle settings at.


Because I could not find a built-in method for this, this is what I did:


  • Within the Alert Definitions, I created a new Alert Definition
    • Standard setup for Name/Description, etc.
    • Selected Virtual Machine for Base Object Type
    • Set Impact to Health
    • Set Criticality to Critical
    • Set Alert Type to Network - Availability
    • Left Wait Cycle set to 1
    • Left Cancel Cycle set to 1
    • Added a built-in symptom definition
      • "Virtual machine overall packets dropped percentage is high"
      • By default, this symptom has its own settings of:
        • (attached screenshot: Symptom.PNG)

The alert seems to work, but I get around 45 alerts per minute on various machines.  I do realize that there may very well be an actual problem with the machines or the networking.  However, I thought I would check here before I go pointing fingers at anyone since I am, in fact, the new guy at the office.

Has anyone else set something like this up?  If you have, could you post your settings as well, or message me, please?

Thanks so much!

6 Replies
mlebied
Enthusiast

We had the same requirement for basic up/down polling, but could not achieve it effectively in vCOps. We implemented OpenNMS, which we found very easy to set up and maintain.

HealthIsWhy
Contributor

Thanks for the reply.  Unfortunately, my task here is to consolidate alerts for disk space and for server up/down (network availability) into vROps.

Implementing another product isn't an option for us.  I was told that there was a way to do this, so I am hoping that is true and that I just need to tweak the alert a bit.

gradinka
VMware Employee

try increasing the cycles back to 3 or more.

I think you're getting spammed because a VM can be idle in terms of network activity for a few minutes and that is normal.
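To illustrate why raising the wait cycle cuts down on noise, here is a rough Python sketch. It is a simplified model of wait/cancel behaviour, not how vROps actually implements it: it assumes an alert turns on only after the symptom has been true for `wait_cycles` consecutive collection cycles, and turns off after it has been false for `cancel_cycles` consecutive cycles. The function names and the sample data are made up for the example.

```python
def alert_states(symptom_active, wait_cycles=1, cancel_cycles=1):
    """Return the alert state (True = active) after each collection cycle.

    Simplified model (an assumption, not vROps internals): the alert flips
    only after the symptom has disagreed with the current alert state for
    the required number of consecutive cycles.
    """
    alert = False
    run = 0  # consecutive cycles in which the symptom disagrees with the alert
    states = []
    for active in symptom_active:
        if active != alert:
            run += 1
            needed = wait_cycles if active else cancel_cycles
            if run >= needed:
                alert = active
                run = 0
        else:
            run = 0
        states.append(alert)
    return states


def count_triggers(states):
    """Count how many times the alert goes from inactive to active."""
    count = 0
    previous = False
    for state in states:
        if state and not previous:
            count += 1
        previous = state
    return count


# Hypothetical symptom history for one VM, one entry per collection cycle.
flapping = [True, False, True, False, True, True, True, False]

print(count_triggers(alert_states(flapping, wait_cycles=1)))
print(count_triggers(alert_states(flapping, wait_cycles=3)))
```

In this model a longer wait only filters out short blips. A symptom that stays true for longer than the wait will still raise the alert.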

HealthIsWhy
Contributor

I tested the alert with this setting (wait cycle set to 3) and I am still getting spammed with alerts on machines even though I can ping and connect to them.  I think I must be using the wrong symptom, or I am missing an additional symptom.

I have the Hyperic Adapter installed and I did notice that it has a built-in "Object is Down" alert that seems to do exactly what I need.  The problem is that it only works with objects that have the Hyperic Agent installed.  I wish I could tell what that alert uses to determine a down status, but it won't let me edit the definition.

rgcda
Enthusiast

Hyperic plugs into vROps and you can then put the Hyperic Agent on all the servers and monitor for the availability status of the agent through vROps and generate an alert from there.

fbess
VMware Employee

That is the drawback of using just one indicator (symptom), as you describe. Is there some other condition which might help?

E.g. the number of packets sent being above a certain number, or just increasing the number of cycles?

There are different possible reasons for packet drops: http://en.wikipedia.org/wiki/Packet_loss
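As a rough sketch of the "second condition" idea, the Python below only treats a high drop percentage as meaningful when the VM has also sent a minimum number of packets in the same cycle, so a near-idle VM does not count. The function name and both threshold values are placeholders picked for illustration; they are not vROps defaults or settings taken from this thread.

```python
def should_alert(dropped_pct, packets_sent,
                 drop_pct_threshold=20.0, min_packets_sent=100):
    """Combine two conditions instead of relying on a single symptom.

    Both thresholds are hypothetical example values. The alert condition
    holds only when the drop percentage is high AND there was enough
    traffic for that percentage to mean something.
    """
    enough_traffic = packets_sent >= min_packets_sent
    high_drops = dropped_pct >= drop_pct_threshold
    return enough_traffic and high_drops


# Hypothetical samples: (dropped percentage, packets sent in the cycle).
samples = [(50.0, 5), (50.0, 1000), (1.0, 1000)]
for dropped_pct, packets_sent in samples:
    print(dropped_pct, packets_sent, should_alert(dropped_pct, packets_sent))
```

The first sample shows the case that a single-symptom alert would flag: a high drop percentage on only a handful of packets.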