VMware Cloud Community
klamerus
Contributor

Backup of Running Systems

We need to back up systems that have a 24x7 availability requirement. Usage may be lower during the late evening and early morning, but the images cannot be down.

What are our options?

At this time, we back up using software inside the images, sending the data to backup servers across the LAN in our data center.

We would love to be able to just copy the snapshots (at the host level) and recover from those if needed.

1 Solution

Accepted Solutions
weinstein5
Immortal

That is one of the most common misconceptions about a snapshot. It does not make a copy of the virtual disk; rather, it locks the virtual disk and redirects the block-level changes to a snapshot log, so it is very quick.

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful

18 Replies
weinstein5
Immortal

Other products that can provide live backups include VMware Consolidated Backup (VCB), Vizioncore's vRanger, and PHD Technologies' esXpress. All three let you back up your VM while it is running, and all three deliver as promised. My favorite of the bunch is vRanger.

ChrisDearden
Expert

Have you had a look at either VCB or Visbu from xtravirt.com? They'll copy the snapshot from the VM for an image-level backup.

malaysiavm
Expert

You can try VM Explorer from Trilead. I am currently running an evaluation of it, and it works pretty well.

klamerus
Contributor

As far as I can tell from the documentation, VMware Consolidated Backup needs to temporarily halt the guest.

At least that's my understanding of its quiescing the guest.

Perhaps someone can tell me if that's wrong.

Anyway, I'd like a recommendation for something that someone is actually using and knows works.

weinstein5
Immortal

Yes, it does quiesce the machine, but this lasts a fraction of a second and will not be noticed. The guest is quiesced just long enough to commit pending writes to disk and take a snapshot, which then allows VCB to access the virtual disk.

klamerus
Contributor

Thanks, it wasn't clear from the EMC docs what exactly this was. In WebSphere MQ, quiescing more or less means sync/flush/pause.

So long as there is no noticeable impact on running apps (except for maybe the most intensive), it would work for us.

Have you used this backup software? Is it working there?

klamerus
Contributor

If the image (the data drives) amounts to 100 GB, will it really be this quick?

weinstein5
Immortal

That is one of the most common misconceptions about a snapshot. It does not make a copy of the virtual disk; rather, it locks the virtual disk and redirects the block-level changes to a snapshot log, so it is very quick.
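
To make the mechanism concrete, here is a toy Python model of the idea. It is only a sketch, not how ESX implements it: the class name `SnapshotDisk` and its methods are made up for illustration, and a real snapshot log works on disk blocks rather than list items.

```python
class SnapshotDisk:
    """Toy model of a virtual disk with a redirect-on-write snapshot (hypothetical)."""

    def __init__(self, blocks):
        self.base = list(blocks)  # contents of the base virtual disk
        self.delta = None         # snapshot log; None means no snapshot is open

    def take_snapshot(self):
        # No blocks are copied, so the cost does not depend on the disk size.
        self.delta = {}

    def write(self, index, data):
        if self.delta is not None:
            self.delta[index] = data  # redirect the change to the snapshot log
        else:
            self.base[index] = data

    def read(self, index):
        # The running guest sees its latest writes, even while the base is frozen.
        if self.delta is not None and index in self.delta:
            return self.delta[index]
        return self.base[index]

    def commit(self):
        # Fold the logged changes back into the base disk and close the snapshot.
        if self.delta is None:
            return
        for index, data in self.delta.items():
            self.base[index] = data
        self.delta = None


disk = SnapshotDisk(["a", "b", "c"])
disk.take_snapshot()
disk.write(1, "B")             # the guest keeps running and writing
backup_copy = list(disk.base)  # the backup reads the frozen base disk
disk.commit()                  # changes are merged once the backup is done
```

In this model `backup_copy` still holds the contents from the moment of the snapshot, while the disk ends up with the newer write after the commit.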

ericsl
Enthusiast

klamerus,

There's more here than meets the eye. You need to do backups, yes, but what about restores? How would those be handled?

What is your Recovery Point Objective? What point in time can you restore from? Anywhere from a few seconds ago (CDP) to a year ago (retention policy)? What kind of data? SQL? Do you back up the OS or just the data?

What is your Recovery Time Objective? How long will it take to restore? If you have large volumes of data that need to be restored, that can take a long time, unless you use software that can do it quicker.

If you need 24x7 availability, you should know in advance how long it is going to take you to recover from any type of failure.

All right, I admit it. I do this stuff for a living, but there are a lot of questions that need to be answered.

Reply if you want professional help.

Eric

klamerus
Contributor

Our RTO and RPO vary by system; however, the general RTO is 48 hours and the general RPO is 24 hours for commodity Windows servers.

Having said that, I fully understand this area. I'm an ITIL-certified infrastructure technical architect working at an operationally superior Fortune 50 company.

Thanks.

klamerus
Contributor

At what point are the changes reconciled back into the master?

ericsl
Enthusiast

klamerus,

So these are the "commodity Windows servers" you are referring to? Is the data so unimportant that the organization can lose a day's worth of it, and then take 2 days to get it back?

Is VCB the only backup you'll be doing? Nothing at the app level? Are you more concerned about the OS or the data?

My question with VCB is: what is the process for accessing the backup data? Do you have to start the virtual machine to get to the files to restore them? I suppose you could use a VMDK mount. How does VCB handle incremental backups?

Just curious...

Eric

weinstein5
Immortal

Once the backup is complete, the changes are committed to the virtual disk as part of the cleanup script.

ericsl
Enthusiast

You said, "If the image (the data drives) amount to 100 GB will it really be this quick?"

Doesn't VCB need to back up the base image as well as the snapshots? Otherwise, how would you restore?

Eric

khughes
Virtuoso

Here is a good document on the comparisons of backup options:

I'm not sure whether you said you were running ESX or ESXi; this article is geared more towards ESX. If you're running ESXi, the vendors should have their software able to back those up pretty soon. We run esXpress, usually scheduled at night when the load is really light. It works great for us: no downtime, and quick backups of large VMs.

Kyle

weinstein5
Immortal

No, it just backs up the base image. The restore point is the instant the snapshot was taken. For example, VCB starts at 0100, the snapshot is taken at 0102, and the backup runs and finishes at 0300; the files you have backed up will be time-stamped prior to 0102. The type of restore depends on the OS, and this is one of the issues with VCB: if it is a Windows OS you can do a file-level backup, but if it is Linux all you can do is a full-image backup.

ericsl
Enthusiast

So in this scenario, do you have access to the snapshots taken within ESX? Otherwise you only have one snapshot to restore from. Also, to my mind, it seems kind of senseless to do a full backup of a VMware data store every night. What if it is 500 GB? A block-level incremental backup at the OS level would make more sense and give you more control over retention policy, etc. However, the OS is a different story. It would be MUCH easier to restore that from a native VMDK file...

Eric

klamerus
Contributor

Some of the images are as large as 100 GB, but more are in the 10-20 GB range.

In most situations I know of where VMs are used, NAS is also used (except for desktop fun and games). We use NetApp devices. We back up to backup servers with their own NAS and then go to tape (for off-site).

There are several benefits (and some drawbacks) to backing up the image (outside ESX, not within).

One benefit is simplicity.

We use TSM at the moment. This requires creating a master backup followed by differential backups. Restoring requires first building the base image, then applying the master, then all the differentials.

The problems with this follow:

Our experience is that the base images aren't always done properly.

If you don't re-master regularly, you end up having to apply lots of differentials. Any failure along the way and you're toast. That includes missing tapes, bad tapes, and bad procedures (when applying them).

Differentials still take time in proportion to the number of files in the server image (each one has to be checked to see whether it needs to be backed up), so servers with lots of files still take time to scan. Luckily, that doesn't require all the time it would take to copy the files across the network. There's also the time to checksum the files copied.

TSM saves time for sure, lots of it, but we currently make new full copies once a month, and once a week for some systems.
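
The fragility of that restore chain can be sketched in a few lines of Python. This is only an illustration of the procedure as I described it (master first, then every differential after it); the function name `restore_chain` and the record fields are made up for the example and have nothing to do with TSM's own interfaces.

```python
def restore_chain(backups, target_day):
    """Return the ordered backups needed to restore to target_day.

    backups: list of dicts with 'day' (int), 'kind' ('master' or
             'differential') and 'readable' (False for a missing or bad tape).
    Raises ValueError if there is no usable master or any link is unreadable.
    """
    candidates = sorted(
        (b for b in backups if b["day"] <= target_day),
        key=lambda b: b["day"],
    )
    masters = [b for b in candidates if b["kind"] == "master"]
    if not masters:
        raise ValueError("no master backup on or before the target day")
    start_day = masters[-1]["day"]  # most recent master before the target
    chain = [b for b in candidates if b["day"] >= start_day]
    for b in chain:
        if not b["readable"]:
            raise ValueError(f"backup from day {b['day']} is unreadable; chain is broken")
    return chain


backups = [
    {"day": 0, "kind": "master", "readable": True},
    {"day": 1, "kind": "differential", "readable": True},
    {"day": 2, "kind": "differential", "readable": False},  # bad tape
    {"day": 3, "kind": "differential", "readable": True},
]
days_needed = [b["day"] for b in restore_chain(backups, target_day=1)]
```

Restoring to day 1 needs the backups from days 0 and 1. Asking for day 3 raises an error because of the bad tape from day 2, and the longer we go between masters, the more links there are that can fail.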

Also, the TSM libraries aren't well segregated as far as which tapes belong to which applications, especially for the differentials.

All of this is better than what we used to do.

If we copy the VMs via the VM backup software, it's a lot simpler. Each image is complete and, for smaller servers, not too large. Still, it will take more time and storage than TSM.

One benefit is RTO. Recovering a server takes only as long as it takes to recover the limited set of files for that image (maybe a dozen files). We clone images all the time, and it's about a half-hour operation (at the high end). It takes far longer than that to install a base OS on a server and then apply the master followed by all the differentials. That's nearly a 12-hour operation if all goes well, and it frequently doesn't, for the reasons above.
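
As a rough back-of-the-envelope comparison against the 48-hour general RTO mentioned earlier in the thread, the arithmetic looks like this. The helper names are made up for the sketch, and the restore times are just the approximate figures quoted above, not measurements.

```python
RTO_HOURS = 48.0  # general RTO quoted earlier in the thread


def rto_headroom(restore_hours, rto_hours=RTO_HOURS):
    """Hours left before the RTO is breached (negative means it is missed)."""
    return rto_hours - restore_hours


def attempts_within_rto(restore_hours, rto_hours=RTO_HOURS):
    """How many complete restore attempts fit inside the RTO window."""
    return int(rto_hours // restore_hours)


image_restore_hours = 0.5  # "about a half-hour operation" for an image
tsm_restore_hours = 12.0   # "nearly a 12-hour operation if all goes well"

image_attempts = attempts_within_rto(image_restore_hours)
tsm_attempts = attempts_within_rto(tsm_restore_hours)
```

Both approaches fit inside the window on paper, but a failed TSM restore uses up a quarter of it each time, while a failed image restore barely registers.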

Another benefit is access controls. We can have people back up VM images without their having access to the content of the images. This would help us with both SOX and export-control issues. The people doing the backups do not need administrative access at the OS level (or above).

Another benefit is scheduling. As much as we'd like to know and control exactly when a backup is made, the fact that most people (including us) have backup servers backing up numerous other servers results in schedule "drift" over time. We have found that systems we wanted backed up at 10:00 for some reason (perhaps server-to-server dependencies) end up getting backed up at "who knows when" within a several-hour window. If we can snapshot the image, the backup will reflect exactly the time the snapshot was taken. We don't want the multi-hour drift (to limit log file growth), but if we have it, we can survive.

Another benefit is DR. We use an external host for off-site DR. Their servers are not the same models as ours, and their base kits are necessarily different. Both we and they actually have a number of kits, due to owning many models of servers from several suppliers. We have run into issues with systems not all having compatible versions of TSM and backups, in addition to the OSes.

There are also additional issues with TSM regarding what it can and can't back up as far as active files, the registry, and other key items. Snapshots of VMs don't have these issues.

There are a number of other issues, but I have to stop somewhere.

It seems that the major downside to backing up at the host level is really the size of the backups themselves and the impact of sending those across our backup network to tape (which is how we do this). That's nothing to sneeze at, but this is worth a good look on our part.
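
To get a feel for that network impact, a simple estimate like the one below is a starting point. The link speed and the efficiency factor are assumptions picked for illustration, not figures from our environment, and the function name `transfer_hours` is made up for the sketch.

```python
def transfer_hours(size_gb, link_gbit_per_s, efficiency=0.7):
    """Estimate the hours needed to move size_gb gigabytes over a network link.

    efficiency is an assumed fraction of the raw link speed that the backup
    traffic actually achieves (protocol overhead, contention, and so on).
    """
    if size_gb < 0 or link_gbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    effective_gbit_per_s = link_gbit_per_s * efficiency
    seconds = size_gb * 8 / effective_gbit_per_s  # 8 bits per byte
    return seconds / 3600


# Our largest images (100 GB) and a typical one (20 GB) over an assumed 1 Gbit/s link.
large_image_hours = transfer_hours(100, link_gbit_per_s=1.0)
typical_image_hours = transfer_hours(20, link_gbit_per_s=1.0)
```

Multiplying the per-image figure by the number of images backed up each night gives a first idea of whether full image copies fit in the backup window at all.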
