Hello,
I'm having problems with ESXi (3.5 U2 latest, both embedded and installable) on three different hosts. Hardware is HP DL380 G5. Both NICs on every server are connected to 1000FDX ports without any duplex issues. ESXi network configuration is the default: both vmnic0 and vmnic1 are used for VM Network and Management Network. Switches show no errors on the ports.
VM Network is not showing any performance problems; I'm getting a steady 30-40 MB/s to and from guest machines.
Accessing the management network (copying to the datastore, Converter access, downloading the VI client, etc.) is painfully slow, ranging from 100 kB/s to 3 MB/s, usually around 1 MB/s. Needless to say, this is very frustrating when, for example, converting existing virtual machines to the ESXi hosts.
Any idea where to start looking for a solution?
Update to perc3 card:
Nope. I managed to get a Linux VM loaded onto the server, where I ran many I/O tests for network/disk/memory performance. I didn't see any real problems from this server to any other. The only problem is that anything I copy over the management NIC is dog slow. This is still happening, and I am still getting I/O errors when trying to copy to the ESXi server. My network team says there is nothing wrong on the network.
dragin33: Where are you seeing the IO errors? ...system log files?
I strongly doubt there is a network problem... just like I strongly doubt any VMware people are actually reading this (...if so, why haven't they responded??). I do believe, however, that the management network is purposefully capped at some arbitrary limit imposed by VMware.
After all, it is a management interface, and ideally you would have the images stored on a SAN, where there are other means of backing up and protecting them. I'm not defending it... because I am a victim too. ...I'm just calling it like I see it.
Suggestion: use Veeam's FastSCP. The public beta of 3.0 supports ESXi, and I am getting a sustained 30 MBytes/sec with their software. With WinSCP I only get about 3 MBytes/sec, and PSCP.exe fails after 2-3 minutes. You can find the download in the forum.
Hmm, how about trying to use a Linux NFS server? Takes about 15 minutes to install it and get the NFS server up and running.
I've tried NFS on Linux and Microsoft (SFU)... same performance. ALTHOUGH... once the image is there, it loads and runs quickly, and file copies to the VM (i.e., to the filesystem in the running VM) are reasonable and "as expected".
It's just that copying over the management interface really stinks.
What were your export settings for the Linux NFS server?
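Not the original poster, but for anyone trying the NFS route, a typical export line for serving a datastore to ESXi looks something like this (the path and subnet here are placeholders, not anyone's actual settings):

```
# /etc/exports on the Linux NFS server -- path and subnet are examples only
/export/vms  192.168.1.0/24(rw,no_root_squash,sync)
```

After editing the file, `exportfs -ra` applies it. The `no_root_squash` option matters: ESXi mounts NFS shares as root, and without it writes tend to fail with permission errors. On the ESXi 3.5 side, the share can then be added as a datastore with `esxcfg-nas -a -o <server> -s /export/vms <name>`.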
Hello,
Excluding scp and NFS (and FastSCP, as it doesn't run on Linux), what are the options to transfer files to ESXi?
I was thinking about using external USB disks, but it seems they're not supported, and DVDs are no option for large files.
Can you use a non-management interface to transfer files?
If not, can you access the datastores from within a VM without using the management interface?
Any other experiences with different transfer types? What if I connect the ESXi server and the NFS server with a crossover cable (sorry, I can't try this because I'm out of the office)?
thanks
giuliano
I am seeing an actual error message pop up from the VMware management console when I go into the datastore browser and try to copy a large file in or out. It just says "IO Error". Sometimes the error comes immediately; sometimes it comes after a long time of (slow) copying.
I would like to try fastscp but my network security peeps have it blocked.
I'm sorry to say I've just tried Veeam FastSCP and I'm not seeing any better speed.
Same here, I've got esx 3i installed with the CD on a Dell Poweredge 2950.
Uploading to the datastore goes through the management LAN and seems to be capped at 2500 KBps.
If I start a second upload with FastSCP, the second upload runs at about 4000 KBps, and looking at the performance monitor I can see the total network utilization go up to 6500 KBps.
If anyone can come up with a solution that would be nice, although my 200 GB VM is almost copied after 3 days.
Edit: I'm running v 130755 atm.
I had the same performance results as you, but between two ESXi servers.
I tried several weekends to move a large VM, but had to abort because of lack of time.
I tried, Converter, FastSCP, SCP and so on.
My solution was to install an eval of VC and import the two ESXi's into it.
Then I did a datastore copy (not storage migration) between the two.
My transfer speed went from about 6 MB/s to 35-40 MB/s.
In my opinion this proves that ESXi can perform well on the management interface, but it seems to behave differently when VC is involved.
I wonder if it is really limited by the drivers in ESXi... also, if capped, it must be capped at a percentage of host CPU or something... because I see varying speeds reported.
I've logged a support request about this (SR # 1155278901). In my lab I have ESXi (latest build) managed by VC 2.5 U2. I saw better speeds (2x) when the VI client was connected via VC than when it was connected directly to ESXi. In both cases the VI client connects directly to the ESXi host to transfer the files, but when the transfer was initiated from the VC connection it was much faster. I saw the same thing with FastSCP 3.0 (beta).
Download
direct - 595 MB file - 46 seconds - 13 MB/s
via VC - 595 MB file - 21 seconds - 28 MB/s
Upload
direct - 5367 MB file - 648 seconds - 8.3 MB/s
via VC - 5367 MB file - 320 seconds - 16.8 MB/s
I ran into the same issue and had a big problem cloning a typical 20 GB image onto several IBM BladeCenter H chassis. I wrote an ash script using rsync and scp that clones the images in cascade. I observed that cascading to 3 hosts at a time gave me the best performance: approximately 30-40 MB/s.
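For anyone curious, the cascade idea can be sketched roughly as below. This is a guess at the structure, not the poster's actual script: the host names, paths, and fan-out of three are placeholders, and RUN defaults to echo so the script only prints the commands it would run.

```shell
#!/bin/sh
# Rough sketch of a cascaded scp clone (ash-compatible).
# All host names and paths are placeholders; RUN defaults to echo (dry run).
SRC="/vmfs/volumes/datastore1/template"
DST="/vmfs/volumes/datastore1/"
RUN="${RUN:-echo}"

copy_wave() {
    # One wave: the seed host pushes the image to up to three targets in
    # parallel; each finished target can seed the next wave, so the number
    # of seeded hosts grows geometrically.
    seed="$1"; shift
    for host in "$@"; do
        $RUN scp -r "root@${seed}:${SRC}" "root@${host}:${DST}" &
    done
    wait
}

copy_wave blade1 blade2 blade3 blade4   # wave 1: one seed, three new copies
copy_wave blade2 blade5 blade6 blade7   # wave 2: each seeded host fans out again
```

With RUN left unset this just prints the scp commands; on a real run each wave finishes (`wait`) before the next starts, which matches the observation that seeding three hosts at a time worked best.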
I am still wondering why this thread has no comments from the VMware support people so far.
@VMware: is there someone who can clarify this issue and advise the community on the best way to import/copy/clone images with ESXi?
Dave,
Any update on the ticket you opened? I recently updated to the 14129 version but I don't think there is much improvement.
They haven't been able to replicate it themselves. Have you seen the same problem?
Same problem with the latest version
Dave:
They haven't been able to duplicate this problem?? Are they smoking something, or have they not been reading the hundreds of threads all over the Internet where (seemingly) everyone has this SAME issue??
Oh well... I don't think they want to officially admit that they have engineered some type of "BW cap" into the VMkernel interface. I could understand if they want to dedicate the management interface... BUT... if we add another VMkernel interface, we should be able to configure VMkernel interfaces as either a "MGT" interface or a "DATA" interface. It makes sense that the management interface isn't flooded with data traffic... BUT VMware SHOULD GIVE US AN ALTERNATIVE!!
This is for benefit of someone at VMWare who has lost their way and found themselves in the middle of our discussion on the "Management Interface".
First, a description of the system:
ESXi 3.5.0 Build 123629
DL350 G5 with 2 Dual-Core 1.6GHz processors
32GB RAM
P400 Controller with 128MB Cache
Array #1: 3 76G 15k SAS Drives (Raid-5)
Array #2: 3 142G 15k SAS Drives (Raid-5)
6 Gig NICs
EMC Celerra NS20 NAS
146G 15k Drives configured as a 4+1 Performance array
We run six VMs on local array #1. Nightly, we use a script to snapshot the images and then "hot clone" them to the 273 GB local array #2. We then copy the cloned images to the EMC NS20 NFS array and delete the snapshots. We keep two copies of the images on local storage and two copies on the EMC NS20 NFS array.
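For what it's worth, a script along those lines could look roughly like the following. This is a hedged sketch, not the poster's actual script: the VM id, names, and paths are invented, the vim-cmd/vmkfstools invocations are from ESXi 3.5's unsupported console, and RUN defaults to echo so the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of a nightly snapshot + hot-clone routine (all names are examples).
RUN="${RUN:-echo}"

nightly_clone() {
    # $1 = VM id (see: vim-cmd vmsvc/getallvms), $2 = VM name,
    # $3 = source volume, $4 = destination volume
    vmid="$1"; vm="$2"; src="$3"; dst="$4"
    stamp=$(date +%Y%m%d)
    # Snapshot so the base disk stops changing while it is copied;
    # the VM keeps running against the snapshot delta.
    $RUN vim-cmd vmsvc/snapshot.create "$vmid" nightly backup 0 0
    $RUN mkdir -p "${dst}/${vm}-${stamp}"
    # vmkfstools -i clones a virtual disk file.
    $RUN vmkfstools -i "${src}/${vm}/${vm}.vmdk" "${dst}/${vm}-${stamp}/${vm}.vmdk"
    # Merge the delta back into the base disk and drop the snapshot.
    $RUN vim-cmd vmsvc/snapshot.removeall "$vmid"
}

# Dry-run example (prints the commands it would execute):
nightly_clone 16 vm01 /vmfs/volumes/array1 /vmfs/volumes/array2
```

The key point is the same as the poster's: clone from a snapshot so the VM never has to be powered off, then delete the snapshot once the copy is safe.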
...and for the finale!
Size of VMs on Array #1: 131 GBytes
Time to clone to Array #2: 35.8 minutes
Speed (local copy): 61.1 MB/sec
Time to clone to NS20 NFS array: 41.5 minutes
Speed (NFS copy): 52.7 MB/sec
So there is only about an 8.4 MB/sec difference between copying between the local disk volumes and copying to the NFS share. Someone who is more of a hardware guru may be able to explain or justify whether these numbers are realistic. It isn't bad... BUT... it isn't great either.
Cloning 131 GBytes of images in 35 minutes locally vs. 41 minutes "over the wire" to NFS: can someone validate whether that is a reasonable level of performance?
That sounds like good results to me. Most people are complaining about getting around 4 MB/s. We can usually get 20 MB/s to our NFS from our ESXi server... we get 50 MB/s from another NFS client.
These appear to be very good results you are reporting. I suspect the difference is that you are running a script, while I believe most of us in this thread are attempting to use the Virtual Appliance Export function or the Datastore Browser to back up the VM to external storage.
I guess the bottom line is: what utility are you using to actually clone the VM?
Lou
P.S. My test VM is approximately the same size, but it takes 5+ hours to complete using the Export function or Datastore Browser. And that's only if it doesn't terminate and die with an I/O error in the middle of the operation.