VMware Cloud Community
maleitch
Enthusiast

VDR gets me in hot water - ridiculous restore time

My smaller company spent a considerable amount of money on a VM/SAN solution and the powers that be are pretty upset with the results we experienced last night.

I have an older Windows 2003 VM that I converted from a physical server, and its "thick" disk is 67GB. Currently, the VDR backups reside on a generic NAS with a single gigabit connection. The VDR appliance connects to the NAS via a Windows share on another server, as I believe that is my only option for network storage. The VMware farm is connected to an EMC SAN via a dedicated 20GB switch "fabric".

The restore for this VM took 8 hours and management is not happy about the turnaround time.

I know I have given very superficial details, but could I hear from someone who has performed a restore from a Windows file share about what performance they saw?


Accepted Solutions
wcpreston
Contributor

The design seems to suggest that the data is moving twice: once from the VMware host to the Windows box, then once from the Windows box to the CIFS share. Is that true? Can you change that?

67GB in 8 hours is roughly 2.4 MB/s, which is pretty crappy throughput. Have you measured how long it takes simply to copy the 67GB from the NAS share back down to the Windows box in the middle, without doing a restore, just copying the file? If that takes 8 hours then you have your culprit. If you don't have space on the Windows box, then I'd copy from the NAS share TO the NAS share with the Windows box in the middle. Again, if that takes 8 hours you have your culprit.

Then I would get a VM on this machine and copy 67 GB from that NAS share directly to it.

Then copy 67 GB from the Windows box in the middle to the VM.

You getting my drift?  If you don't have room for 67GB, find something smaller you CAN fit and move back and forth each time.  See where your slow link is.
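The hop-by-hop copy test described above could be timed with something like the following minimal sketch. It is not part of VDR or any VMware tooling; the function name `timed_copy` and the paths in the usage note are hypothetical, and it assumes the NAS share is reachable as an ordinary file path from the machine running the script.

```python
import shutil
import time
from pathlib import Path


def timed_copy(src: str, dst: str) -> float:
    """Copy src to dst and return the observed throughput in MB/s.

    Hypothetical helper for comparing one hop against another; it is
    a plain file copy, not a VDR restore.
    """
    size_mb = Path(src).stat().st_size / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length timing on very small files.
    return size_mb / elapsed if elapsed > 0 else float("inf")
```

Run it once per hop, for example `timed_copy(r"\\nas\backups\test.bin", r"D:\temp\test.bin")` on the Windows box, then again from inside a VM, and compare the numbers. Use a test file large enough that operating system caching does not flatter the result.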

Just curious: was this a test restore, or a production restore that was never tested?

4 Replies
phykell
Enthusiast

If you're using a Windows Share, you may be able to use the NAS performance utility available from Intel:

http://www.intel.com/products/server/storage/NAS_Perf_Toolkit.htm

The results may surprise you: you will probably find that your gigabit connection is nowhere near its capacity, and that the bottleneck is the NAS itself.

Can you use an NFS volume instead of the CIFS one and does your current CIFS volume exceed the recommended 500 GB limit? You may run into performance issues if you exceed the 500 GB limit on a CIFS volume.

Can you back up to another location on your SAN, or (understandably) do you need the data to be backed up to an alternative device?

Are you running VDR 2.0 and are your hosts at least 4.0? VDR 2.0 can be considerably faster than previous versions and VDR is faster for 4.0 hosts and later.

As a fix, it might be a better idea to separate your O/S and data; that way, VDR has a smaller system disk to deal with, which will restore quickly. If the data lives elsewhere, on a network share or a separate VMDK for example, that may mitigate the current 8 hour turnaround time. If your O/S becomes unusable, you can simply restore the O/S drive and your data drive is immediately available. If your data drive is corrupted for some reason, the system is still available to at least some users while you restore, and there are other, potentially more efficient, backup strategies for your data.

maleitch
Enthusiast

Thank you everyone for your replies; you were all correct as far as troubleshooting goes. Just to follow up, I did some regular file copy tests and they also took longer than expected, but not quite as bad as what I was seeing during the restore process. I contacted VMware support and they suggested avoiding the SMB option and using NFS, which I honestly did not know was an option.

When doing the restore via NFS, the process took one hour and 45 minutes, so a world of improvement. My own ignorance was more to blame for this issue than any problem with VDR.
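For a rough sense of scale, the two restore times reported in this thread can be turned into average throughput figures. This is only back-of-the-envelope arithmetic, assuming the full 67GB was transferred both times and treating 1 GB as 1024 MB; the function name is made up for the illustration.

```python
def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average throughput in MB/s, treating 1 GB as 1024 MB."""
    return size_gb * 1024 / (hours * 3600)


cifs = throughput_mb_per_s(67, 8.0)    # original restore via the Windows share
nfs = throughput_mb_per_s(67, 1.75)    # restore via NFS (1 hour 45 minutes)
speedup = nfs / cifs

print(f"CIFS restore: {cifs:.2f} MB/s")   # 2.38 MB/s
print(f"NFS restore:  {nfs:.2f} MB/s")    # 10.89 MB/s
print(f"Speedup:      {speedup:.2f}x")    # 4.57x
```

Even the NFS figure is well below the roughly 125 MB/s theoretical ceiling of a single gigabit link, so the network connection itself is unlikely to be the limit in either case.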

phykell
Enthusiast

NFS - that's what I said! :)

I'm using an NFS destination and a CIFS/SMB destination on my appliance, and while I haven't tried a restore comparison, the integrity checks and backups seem to run much quicker overall on the NFS destination. Having said that, it's not necessarily that CIFS performs worse than NFS in general (that's a whole area of debate). It's probably more to do with the fact that you can directly attach a virtual disk (VMDK) residing on NFS to the VDR appliance, rather than having to network mount a CIFS/SMB file system via the VDR interface.

Incidentally, and for completeness, it's probably worth mentioning that VDR also supports Raw Device Mapping (RDM) for the attached local disk.
