VMware Horizon Community
snthaoeu
Enthusiast

pcoip_win32_server.exe NOT Using Enough vCPU

I noticed in Task Manager that no matter how many vCPUs my VM is allocated, the pcoip_win32_server.exe process only ever uses the equivalent of one vCPU's resources. If I have two vCPUs provisioned, the process takes up to 50% of CPU resources. If I have 16 vCPUs, the process maxes out at 1/16, or about 6-7%.
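The ceiling described here matches what a process confined to a single thread would show, since Task Manager reports usage as a share of all logical processors. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and treating the process as single-threaded is an assumption at this point in the thread.

```python
def single_thread_ceiling_pct(vcpus: int) -> float:
    """Highest CPU % a process limited to one busy thread can show
    when usage is reported as a share of all vCPUs in the VM."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return 100.0 / vcpus


# vCPU counts mentioned in the post, plus a couple in between.
for n in (1, 2, 4, 16):
    print(f"{n:>2} vCPU: ceiling {single_thread_ceiling_pct(n):.2f}%")
```

With 2 and 16 vCPUs this gives the 50% and roughly 6-7% figures reported above.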

We also have multi-monitor clients, with up to four 1920x1200 monitors. The pcoip_win32_server.exe process takes just as much (little) CPU resources when serving up graphics for a single monitor as it does for four.

For example, if an application such as Excel is running at 12 fps on one monitor, that frame rate drops to half that with two monitors, and half again (yes, 3 fps) when Excel is stretched across four monitors. Switching to a single 4K (3840x2160) monitor results in roughly 1/3 the fps, since it is serving up roughly three monitors' worth of pixels.
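These numbers are consistent with an encoder that pushes a roughly fixed number of pixels per second. The sketch below assumes exactly that (constant pixel throughput, whole screen changing every frame), which is a simplification and not something the post establishes; the helper names are hypothetical.

```python
BASE_FPS = 12.0            # observed in the post on one 1920x1200 monitor
BASE_PIXELS = 1920 * 1200  # pixels on that single monitor


def total_pixels(width: int, height: int, monitors: int = 1) -> int:
    """Pixel count across all monitors of the same resolution."""
    return width * height * monitors


def predicted_fps(pixel_count: int) -> float:
    """Scale the observed baseline inversely with the pixel count
    (assumes constant encoder throughput in pixels per second)."""
    return BASE_FPS * BASE_PIXELS / pixel_count


layouts = {
    "1 x 1920x1200": total_pixels(1920, 1200, 1),
    "2 x 1920x1200": total_pixels(1920, 1200, 2),
    "4 x 1920x1200": total_pixels(1920, 1200, 4),
    "1 x 3840x2160": total_pixels(3840, 2160, 1),
}

for name, count in layouts.items():
    ratio = count / BASE_PIXELS
    print(f"{name}: {ratio:.1f}x pixels, ~{predicted_fps(count):.1f} fps")
```

Under this model the 4K panel carries 3.6 times the pixels of one 1920x1200 monitor, which lines up with the "roughly 1/3" frame rate reported.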

Our clients vary from a V1200-QP zero client to full-blown desktops running Horizon View clients both over LAN and WAN.

Is this expected behavior for a current, VMware-compatible ESXi server? (Which is precisely what we have, two of them from HP in fact). In other words, should we expect this sort of "low" CPU usage for a software encoding solution rather than using dedicated hardware?

12 Replies
douglasarcidino
Hot Shot

Have you performed any of the tweaks in http://www.vmware.com/files/pdf/view/vmware-horizon-view-best-practices-performance-study.pdf ?

It's very important to tweak it down. I do know that there is a still-unresolved bug that causes screen lag while going line by line through documents. If you post your GPOs for the VMware View ADM templates, I can give you some pointers. If you open the sessions over RDP, do you still get the frame rate issues?

If you found this reply helpful, please mark as answer VCP-DCV 4/5/6 VCP-DTM 5/6
snthaoeu
Enthusiast

Have you performed any of the tweaks in http://www.vmware.com/files/pdf/view/vmware-horizon-view-best-practices-performance-study.pdf ?

We are in the process of going through this list yet again, but so far nothing obvious stands out. That is perhaps the most frustrating part of this, the other being VMware's own reps telling us not to expect native speed even in Office applications across more than two monitors (which we still don't achieve), never mind four.

It's very important to tweak it down. I do know that there is a still-unresolved bug that causes screen lag while going line by line through documents. If you post your GPOs for the VMware View ADM templates, I can give you some pointers.


Interesting, I was not aware of such a bug. Let me see if I can post the GPO: thanks for offering to look through them.

If you open the sessions over RDP, do you still get the frame rate issues?


So here's the rub: RDP scrolling in Excel is fast. So yes, RDP is indeed better than PCoIP in this narrow dimension. We all know how RDP differs from native desktop performance -- not the least being poor video performance compared to PCoIP -- but in this one highly relevant test (to us), PCoIP falls flat on its face. It's the one reason we don't already have a dozen ESXi servers replacing the desktops on our floor.


Let me see if I can get that GPO to you. Thanks again for responding: I'm guessing it's not correct that pcoip_win32_server.exe isn't scaling in CPU resources beyond 1/vCPU?

douglasarcidino
Hot Shot

Based on what you said, I expect you are seeing the same screen refresh bug that I filed a ticket for with VMware about 20 months ago. A good test is to scroll down a list line by line using the arrow keys on the keyboard. Another funny test you may try is to launch soundrecorder.exe via a login script and see if performance improves. We found that launching an audio or video app and having it run in the background improved performance. I know how stupid that sounds but give it a try.

douglasarcidino
Hot Shot

One other thing to try instead of the sound recorder idea: go into the View ADM templates. Under PCoIP Session Variables, open "Configure PCoIP image quality levels". Set it to Enabled and configure the three options: minimum image quality 50, maximum initial image quality 90, maximum frame rate 120.

snthaoeu
Enthusiast

Thanks again for the ideas. I've tried:

- Opening Sound Recorder. No change

- Running a 1080p HD video in the background (via Chrome): definitely slower Excel scrolling performance. TX bandwidth just pegs at 50 Mbps and pcoip_win32_server.exe never passes 7%.

- Being lazy and looking in the Registry (HKLM\SOFTWARE\Wow6432Node\Policies\Teradici\PCoIP\pcoip_admin), we have set min = max quality at 100, and max frame rate at 120. In fact the full list is:

\pcoip_admin
pcoip.device_bandwidth_floor: 900000
pcoip.enable_build_to_lossless: 1
pcoip.enable_server_clipboard: 0
pcoip.enable_vchan: 1
pcoip.maximum_initial_image_quality: 100
pcoip.minimum_image_quality: 100

\pcoip_admin_defaults
pcoip.device_bandwidth_floor: 1000000
pcoip.enable_build_to_lossless: 1
pcoip.maximum_frame_rate: 120
pcoip.maximum_initial_image_quality: 80
pcoip.minimum_image_quality: 70
pcoip.transport_session_priority: 1
pcoip.use_client_img_settings: 1

Incidentally, we have gigabit LAN and are upgrading the switches to 10GBASE-T. Also this is a test 28-core ESXi server on which basically I'm the only user. That's why the bandwidth floor is set so high.

snthaoeu
Enthusiast

BTW, do you still have the description for the ticket you submitted to VMware? That sounds like exactly what we're experiencing Excel-wise, though the overall lack of pcoip_win32_server.exe scaling is still the major issue.

douglasarcidino
Hot Shot

"We have a production View environment that we are deploying and getting ready to roll out enterprise wide. Right now we are experiencing extremely slow performance in text based applications like Microsoft Excel and Notepad only when using PCoIP. If we connect to the VM over RDP, there is no issue. "

That's the text we used. Here is the KB about the bug: VMware KB, "Redraw lag in text-based applications occur in PCoIP Sessions".

snthaoeu
Enthusiast

Ah, I have seen this KB. Nice to meet its owner :)

To the KB I can add that for line-by-line scrolling via keyboard on multi-monitor setups, the screens don't update "together". They are distinctly out-of-sync. This is greatly mitigated by adding the APEX 2800 offload card, but not eliminated and frankly I shouldn't be seeing these issues when I'm the only VM user on a 28-core ESXi server.

For the View environment in the KB, did you end up rolling it out and if so was there anything hardware-related you added to mitigate the issues?

And more relevantly: do you see pcoip_win32_server.exe CPU usage being anything greater than "single-threaded" on one vCPU?

douglasarcidino
Hot Shot

snthaoeu wrote:

Ah, I have seen this KB. Nice to meet its owner

To the KB I can add that for line-by-line scrolling via keyboard on multi-monitor setups, the screens don't update "together". They are distinctly out-of-sync. This is greatly mitigated by adding the APEX 2800 offload card, but not eliminated and frankly I shouldn't be seeing these issues when I'm the only VM user on a 28-core ESXi server.

For the View environment in the KB, did you end up rolling it out and if so was there anything hardware-related you added to mitigate the issues?

And more relevantly: do you see pcoip_win32_server.exe CPU usage being anything greater than "single-threaded" on one vCPU?

FYI, the APEX 2800 card ONLY helps when you have CPU overcommit issues on your host. I beta tested those cards and always found that adding hosts was cheaper, given the mediocre improvements we got.

We just implemented the sound recorder fix or the 120fps change because we needed HW version 8 or higher.

I run two-vCPU desktops and rarely see usage of that service above 10 percent, but I don't see it threading to multiple CPUs, if that answers your question.

snthaoeu
Enthusiast

Sorry it took so long to post the GPO. I actually fretted about sanitizing it, but I can see nothing that isn't totally standard and non-proprietary.

In the registry we've played with all sorts of overrides for max this-or-that (900000-1000000 bandwidth floor, 120 fps, image quality up the wazoo, build-to-lossless on, etc.) and are unable to get the performance we expect, even for keyboard scrolling in Office apps.

As for the APEX 2800 card, that's the funny thing. I understand it's not supposed to accelerate anything if we don't have CPU overload issues, which we don't because of the aforementioned 28-core, single-user test machine I've been luxuriating in. Yet with our clients, especially the 10ZiG V1200-QP zero client, it's unavoidably noticeable that Excel scrolling is much more in sync between monitors and much smoother than without the offload card. On four monitors we're still talking high single-digit fps, which sucks, but it's way better than the 1-3 fps we get without the APEX card.

I think we just have an underlying issue somewhere in our ESXi server or network architecture. But I keep suspecting that VMware just doesn't adequately support more than 2 monitors, period.

Thanks so much for your help. If you see anything in the GPO that looks like a bonehead setting, I'd love to hear about it.

kazimnaim
Contributor

Hi there,

I need some info along the lines of the discussion in this thread.

I am trying to put together a POC for designer VDIs.

Hardware is a Supermicro 4028 server with 512 GB RAM and dual E5-2697 16-core processors.

We are running Horizon View 7 with an NVIDIA GRID Tesla M60 GPU in graphics mode.

On the VMs (4 vCPU, 16 GB RAM; fat client only, two 1920x1080 screens), full-screen 1080p video stutters even on a single screen. I have already done the registry tweaks for PCoIP.

Is this expected? Would an APEX 2800 give some benefit?

Any comments?

RandyDGroves
Enthusiast

Just saw this post. Don't know if you are still experiencing the issue.

Looking at these registry settings, setting max and min image quality to 100 will slow down the encoding with little noticeable benefit to full-screen compressed video. So, unless you are dealing with raw-pixel videos, you should drop pcoip.maximum_initial_image_quality to 90 and pcoip.minimum_image_quality to 50. Also, pcoip.maximum_frame_rate = 30 will eliminate unnecessary compute cycles, since most video content is 24 or 30 fps. A 120 fps sample rate is excessive.
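To make the suggested changes concrete, here is a small sketch that lists which of the posted values differ from the ones recommended above. The `current` values are copied from the registry listing earlier in the thread; the helper name is hypothetical, and nothing here reads or writes the actual registry.

```python
# Values posted earlier in the thread. The two quality settings come from
# \pcoip_admin; the frame rate is only listed under pcoip_admin_defaults.
current = {
    "pcoip.maximum_initial_image_quality": 100,
    "pcoip.minimum_image_quality": 100,
    "pcoip.maximum_frame_rate": 120,
}

# Values suggested in this reply.
suggested = {
    "pcoip.maximum_initial_image_quality": 90,
    "pcoip.minimum_image_quality": 50,
    "pcoip.maximum_frame_rate": 30,
}


def settings_to_change(current: dict, suggested: dict) -> dict:
    """Return {name: (current_value, suggested_value)} for every
    suggested setting whose current value is different or missing."""
    return {
        name: (current.get(name), value)
        for name, value in suggested.items()
        if current.get(name) != value
    }


for name, (old, new) in settings_to_change(current, suggested).items():
    print(f"{name}: {old} -> {new}")
```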

Finally, be sure that your server power setting is set to "high performance" instead of the default (https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10182...

Even with these changes, pcoip_win32_server.exe is single-threaded and will max out at around 60 million changed pixels per second (Mpps) per VM. The PCoIP hardware accelerator card supports up to 80 Mpps per display, which means 160 Mpps for a dual-display system, so it can improve your experience even if there is only one VM on the system.
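As a rough illustration of what those limits mean in frames per second, the sketch below divides each cap by the pixel count of the displays. It assumes the worst case where every pixel on every display changes each frame; the 60 and 80 Mpps figures are the ones quoted above, and the function name is made up for the example.

```python
SOFTWARE_CAP_MPPS = 60.0              # per VM, single-threaded software encoder
HARDWARE_CAP_MPPS_PER_DISPLAY = 80.0  # per display, with the accelerator card


def max_fps(cap_mpps: float, width: int, height: int, displays: int = 1) -> float:
    """Frame-rate ceiling if all pixels on all displays change every frame."""
    return cap_mpps * 1_000_000 / (width * height * displays)


# Software encoder: the cap is shared by all displays of the VM.
for displays in (1, 2, 4):
    fps = max_fps(SOFTWARE_CAP_MPPS, 1920, 1200, displays)
    print(f"software, {displays} x 1920x1200: ~{fps:.1f} fps")

# Accelerator card: the cap applies to each display separately.
fps = max_fps(HARDWARE_CAP_MPPS_PER_DISPLAY, 1920, 1200)
print(f"hardware, per 1920x1200 display: ~{fps:.1f} fps")
```

Real sessions rarely change every pixel each frame, so these are ceilings for full-screen updates and not predictions for typical scrolling.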
