Bit of a strange title, I realise. An RDM is treated like a physical device, but obviously it is routed through the HBA. What I am trying to find is an article that discusses this in terms of the queues etc. that any SCSI command would go through.
For a normal datastore the queue depth on an HBA is 32. Does an RDM, for example, have its own queue in the same way as a datastore? Is there also queueing on the virtual machine in a software HBA?
We have a lot of RDMs in our environment — a lot of Linux clustering and Windows MSCS. Is there any limit on the number of RDMs that are supported on a host? In a datacenter?
There is not a lot of information out there that I can find. Anybody got any decent links? I have a lot of blanks to fill concerning RDMs.
Cheers
Al
You can present up to 256 LUNs to an ESXi host. ( http://www.vmware.com/pdf/vsphere5/r51/vsphere-51-configuration-maximums.pdf )
I tried to summarize the storage stack in a vSphere environment in a blog article a few weeks ago (it references a bunch of good resources):
http://vxpertise.net/2013/01/esxi-and-its-storage-queues/
If you map a LUN in physical compatibility mode, the SCSI commands will go "untouched" to the LUN (except for one command — REPORT LUNS, if I remember correctly). If you choose virtual compatibility mode, the SCSI commands go through the whole ESXi storage stack.
Since ESXi maintains a queue per LUN, the raw device will have its own device queue for its LUN.
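To make the "queue per LUN" point concrete, here is a toy Python model (my own sketch, not VMware code): every LUN the host sees — whether it backs a VMFS datastore or is presented to a VM as an RDM — gets its own device queue, with a default depth of 32 outstanding commands.

```python
# Toy model of ESXi per-LUN device queues (illustration only, not VMware code).
# Each LUN has its own independent queue; filling one does not affect another.

DEFAULT_QUEUE_DEPTH = 32

class LunQueue:
    """One device queue per LUN on the host."""
    def __init__(self, name, depth=DEFAULT_QUEUE_DEPTH):
        self.name = name
        self.depth = depth          # max outstanding SCSI commands
        self.outstanding = 0        # commands currently in flight

    def issue(self):
        """Try to issue one SCSI command; return False if the queue is full."""
        if self.outstanding >= self.depth:
            return False
        self.outstanding += 1
        return True

# A datastore LUN and an RDM LUN queue independently of each other.
datastore = LunQueue("naa.datastore01")
rdm = LunQueue("naa.rdm01")

for _ in range(32):
    assert datastore.issue()        # first 32 commands fit
assert not datastore.issue()        # the 33rd must wait

assert rdm.issue()                  # the RDM's own queue is unaffected
```

The key takeaway is that the queues are independent: a saturated datastore queue does not block commands heading to an RDM's LUN.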
Regarding your queries:
"For a normal datastore the queue depth on an HBA is 32" — the default per-LUN queue depth in ESX/ESXi is 32. It can be increased, but that is normally not recommended.
The queue depth can be modified using the Disk.SchedNumReqOutstanding advanced parameter. Values between 1 and 256 (concurrent commands) are acceptable. Without proper analysis it is not recommended to change this.
"Does an RDM, for example, have its own queue in the same way as a datastore?" — Yes, each RDM also has its own per-LUN queue.
"Is there also queueing on the virtual machine in a software HBA?" — With a software HBA the queueing happens at the LUN level, not the HBA level.
"Is there any limit on the number of RDMs that are supported on a host? In a datacenter?" — The default ESX limit is 256 LUNs per host, and this also applies to RDMs.
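Those two limits — 256 LUNs per host (datastores plus RDMs combined) and Disk.SchedNumReqOutstanding between 1 and 256 — can be captured in a small sanity-check helper. This is a hypothetical sketch (the function name and structure are my own, not a VMware API):

```python
# Hypothetical design-check helper (not a VMware API): validates a host
# design against the vSphere 5.x maximums discussed above.

MAX_LUNS_PER_HOST = 256   # per the vSphere 5.1 configuration maximums

def check_host_design(num_datastores, num_rdms, sched_num_req_outstanding=32):
    """Return a list of limit violations (empty list means the design fits)."""
    errors = []
    total_luns = num_datastores + num_rdms
    if total_luns > MAX_LUNS_PER_HOST:
        errors.append(
            f"{total_luns} LUNs exceeds the {MAX_LUNS_PER_HOST}-LUN host maximum"
        )
    if not 1 <= sched_num_req_outstanding <= 256:
        errors.append("Disk.SchedNumReqOutstanding must be between 1 and 256")
    return errors

# 40 datastores + 150 RDMs = 190 LUNs: within the host limit.
assert check_host_design(40, 150) == []

# 120 datastores + 200 RDMs = 320 LUNs: over the limit.
assert check_host_design(120, 200) != []
```

Note that RDMs and datastore LUNs draw from the same 256-LUN budget, which is why RDM-heavy cluster designs can hit the ceiling sooner than expected.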
Also, when dealing with RDMs, the facts below are useful:
VM with Physical (Pass-Thru) RDMs (Powered On – Storage vMotion):
VM with Virtual (non Pass-Thru) RDMs (Powered On – Storage vMotion):
VM with Physical (Pass-Thru) RDMs (Powered Off – Cold Migration):
VM with Virtual (non Pass-Thru) RDMs (Powered Off – Cold Migration):
Assuming these replies are right, some of the answers (mainly one) I find really worrying.
We have 40-ish datastores in use and about 150 RDMs. Assuming the normal queue depth of the HBA, which we have not changed, the virtual machines with RDMs have roughly four times as many queue slots available for SCSI commands.
Surely this is quite a serious imbalance, and could lead to problems with RDM access overwhelming the disk subsystem to the detriment of the disks held on the datastores.
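One reading of that "four times" figure, assuming the default device queue depth of 32 for every LUN and no HBA-level changes, is simply the ratio of aggregate RDM queue slots to aggregate datastore queue slots:

```python
# The arithmetic behind the "roughly four times" claim, assuming the
# default per-LUN queue depth of 32 and no tuning on the host.
QUEUE_DEPTH = 32

datastore_slots = 40 * QUEUE_DEPTH    # 40 datastore LUNs  -> 1280 slots
rdm_slots = 150 * QUEUE_DEPTH         # 150 RDM LUNs       -> 4800 slots

ratio = rdm_slots / datastore_slots
assert ratio == 3.75                  # i.e. roughly 4x
```

The imbalance is sharper than the raw ratio suggests, because the datastore slots are shared by many VMs while each RDM's slots belong to a single VM.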
Can you clarify your statement: "assuming the normal queue depth of the HBA, which we have not changed, the virtual machines with RDMs have four times as many queue slots available for SCSI commands"?
Actually, we are not supposed to change or alter the HBA or LUN queue depth from the ESX side; it is very rarely done. If you have a DAS or SAN dedicated to only a few VMs or hosts — as in the case of Exchange, SQL, or Oracle clusters — then we calculate the VM queue depth, IOPS, workload, and so on; there is a lot to calculate and plan. In those cases we can change and optimize the queue depth.
But in a large environment, with 100+ hosts, a big SAN, and 500+ LUNs, it is not feasible to configure and manage the queue depth per LUN, and it can give negative results as well.
On the RDM use case: VMware itself says there is only about a 1% performance advantage, and feature- and capability-wise VMFS is the best choice. If you have MS SQL or Red Hat Linux clusters with fencing, then we can use RDMs, and they are also useful for SAN snapshot use cases.
Apart from this, RDMs have limited capability.
In storage design, proper capacity planning, data growth, storage IOPS sizing, the array cache — and, for NFS/iSCSI, the network design — all play an important role.
So simply looking at it from the LUN queue depth angle won't solve any problems; it may get you more IOPS for one LUN while causing others to lose out.
http://www.yellow-bricks.com/2011/07/29/vmfs-5-lun-sizing/
http://frankdenneman.nl/2009/03/04/increasing-the-queue-depth/
http://partners.netapp.com/go/techontap/matl/san-boot.html
http://www.yellow-bricks.com/2009/07/07/max-amount-of-vms-per-vmfs-volume/
The links above will help you get more info.
I have read a lot of the articles, hence my queries. For datastores you would enable SIOC in case of the noisy-neighbour problem, to guarantee that all machines have equal access to the queues on the HBA.
With RDMs, which we are forced to use because of MSCS and Linux clustering, there is no SIOC and nothing to stop a virtual machine being given, by default, more access to the disks the more RDMs it has.
Let me make the problem, as I see it, a bit clearer:
1 host, 5 virtual machines: 4 with one VMDK each on one datastore, 1 with one VMDK and 10 RDMs, and just the one HBA.
From what you have told me so far, the 5 virtual machines accessing the datastore share a single per-LUN queue of depth 32 — 32 slots to put SCSI commands in for all 5 virtual machines, correct? So each machine has roughly 6 slots in the queue for outstanding disk requests.
The final machine also has 10 RDM disks, and each RDM disk has its own queue of depth 32, so just for the RDM disks there are 320 slots for SCSI commands.
This one machine with RDMs therefore has roughly 326 slots available across the various queues for disk requests — about 54 times the potential capacity for outstanding disk requests of a machine that just uses the datastore.
It seems to me that the potential for this one machine to drag down the whole infrastructure could be a significant problem, over which you have no control.
Does this make sense, and do you think this could become a problem?
VMware have confirmed this to be an issue. In a disk-contention situation, RDMs will have much better access to the disk. If you have a datastore with 10 VMs on it and a machine with one RDM, the RDM will get roughly 10 times the effective queue depth of any one VM on the datastore.
RDM is always tricky; that is why it is recommended to use VMFS. I recently built even Oracle RAC on VMFS and it went fine.
Controlling the IOPS and queueing for RDMs is tough.