VMware Cloud Community
vDanTheMan
Contributor

All-Flash vSAN white-box home lab design

I am refreshing my home lab and am considering different options for building the systems.  This home lab is fairly serious (for me), and I hope it will last for 4 or 5 years.  I plan on doing testing with big data, Hadoop, Ceph, and other disk-intensive technologies, but ESXi will remain the bottom-layer OS.  I am planning on a three-system configuration and am working through the vSAN setup.  I would like to be able to add to the storage over time as cost/GB comes down.

There are some new motherboards with "Ultra" M.2 or U.2 interfaces that use four PCIe 3.0 lanes for 32 Gb/s of bandwidth.  Those motherboards also support a large number of slower storage interfaces such as SATA3 or SAS.  My thought is that I could build disk groups using the faster M.2 and U.2 interfaces to drive NVMe SSDs as the flash cache/write-cache device, with the remaining SSDs in each disk group connected to the more numerous SATA3 or SAS ports.  Alternatively, if I limited myself to fewer drives, two of the motherboards below natively provide three PCIe 3.0 x4 M.2 and/or U.2 interfaces; I could add more using PCIe-slot-to-M.2 adapter cards and then run all NVMe drives without being held to the slower speeds of SATA3, eSATA, or SAS.  I am unsure whether an all-NVMe configuration would produce better results than NVMe cache in front of SATA3, eSATA, or SAS capacity drives, but the slower interfaces obviously allow more room for expansion.
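As a rough sanity check on the interface question, here is a quick back-of-the-envelope comparison of theoretical usable link rates after encoding overhead.  These are ceilings only; real drives, controllers, and vSAN itself will land well below them.

# Rough theoretical usable bandwidth per interface, after encoding overhead.
# These are link-rate ceilings, not measured SSD or vSAN throughput.

def pcie3_usable_gbps(lanes):
    # PCIe 3.0: 8 GT/s per lane with 128b/130b encoding
    return lanes * 8.0 * (128 / 130)

def sata3_usable_gbps():
    # SATA3: 6 Gb/s line rate with 8b/10b encoding
    return 6.0 * (8 / 10)

def sas3_usable_gbps():
    # SAS3: 12 Gb/s line rate with 8b/10b encoding
    return 12.0 * (8 / 10)

interfaces = {
    "Ultra M.2 / U.2 (PCIe 3.0 x4 NVMe)": pcie3_usable_gbps(4),
    "SAS3 (per port)": sas3_usable_gbps(),
    "SATA3 (per port)": sata3_usable_gbps(),
}

for name, gbps in interfaces.items():
    print(f"{name:36s} ~{gbps:5.1f} Gb/s  (~{gbps / 8:4.2f} GB/s)")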

Here are the motherboards I am looking at:

ASRock X99 Extreme11 - X99 chipset, socket LGA 2011v3, supports 128GB DDR4 including ECC

10 SATA3 and 8 SAS3 via LSI SAS 3008 controller, 2 eSATA, 2 Ultra M.2

ASRock > X99 Extreme11

ASRock Z170 Extreme7+ - Z170 chipset, socket LGA 1151, supports 64GB DDR4

- 6 x SATA3 6.0 Gb/s connectors (Intel® Z170)

- 4 x SATA3 6.0 Gb/s connectors (ASMedia ASM1061), supporting NCQ, AHCI and Hot Plug

- 3 x SATA Express 10 Gb/s connectors

- 3 x Ultra M.2 sockets

ASRock > Z170 Extreme7+


ROG Maximus VIII Hero Alpha - Z170 chipset, socket LGA 1151, supports 64GB DDR4

2 x U.2, 1 x Ultra M.2, 6 x SATA3

http://www.asus.com/ROG-Republic-Of-Gamers/ROG-MAXIMUS-VIII-HERO-ALPHA/


I would like to hear your thoughts on whether an all-NVMe setup would perform much better than an NVMe cache tier in front of a SATA3, eSATA, or SAS capacity tier.  I note that ASRock claims to have reached 6.1 GB/s of sustained transfer using the 18 SATA3/SAS ports and an additional 2.8 GB/s using the 2 Ultra M.2 ports, reaching a total of 8.4 GB/s in IOMeter.  See the video here:

ASRock X99 Extreme11 - YouTube
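Out of curiosity, I also put the quoted demo figures next to the theoretical port ceilings (same encoding assumptions as the sketch above; the demo numbers are just the ones reported in the video, not my own measurements):

# Compare ASRock's quoted IOMeter figures against theoretical port ceilings.
# Treats all 18 SATA/SAS ports at the SATA3 usable rate (SATA SSDs were
# presumably used in the demo), and ignores drive and controller limits.

sata_sas_ports = 18                      # 10 x SATA3 + 8 x SAS3 on the X99 Extreme11
sata3_gb_s = 6.0 * 0.8 / 8               # ~0.6 GB/s usable per SATA3 port (8b/10b)
m2_ports = 2
m2_gb_s = 4 * 8.0 * (128 / 130) / 8      # ~3.9 GB/s usable per PCIe 3.0 x4 M.2

print(f"Ceiling, 18 SATA/SAS ports: {sata_sas_ports * sata3_gb_s:5.1f} GB/s (demo: 6.1 GB/s)")
print(f"Ceiling,  2 Ultra M.2     : {m2_ports * m2_gb_s:5.1f} GB/s (demo: 2.8 GB/s)")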


zdickinson
Expert

Good afternoon, I would guess that the all-NVMe config would perform better under enough punishment.  It sounds like, for your use case, you might just be able to supply it.

I think for long-term $$$, you may want to add a 4th node so that you can use the new RAID-5 erasure coding in v6.2 rather than just the standard FTT = 1 mirroring.  Lots of space savings.  Add on the dedup and compression, and you'll have yourself a nice setup.
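To put rough numbers on that, here is a toy comparison of the raw capacity needed for the same usable data under each scheme (ignoring dedup/compression and slack space):

# Toy comparison of raw capacity needed under FTT=1 mirroring (RAID-1) vs the
# RAID-5 erasure coding added in vSAN 6.2.  RAID-5 needs at least 4 hosts and
# an all-flash cluster; dedup/compression and slack space are ignored here.

def raw_tb_needed(usable_tb, scheme):
    overhead = {
        "raid1_ftt1": 2.0,       # two full mirror copies
        "raid5_ftt1": 4.0 / 3,   # 3 data + 1 parity components
    }
    return usable_tb * overhead[scheme]

usable = 10  # TB of VM data, as an example
print(f"RAID-1 (FTT=1): {raw_tb_needed(usable, 'raid1_ftt1'):.1f} TB raw")
print(f"RAID-5 (FTT=1): {raw_tb_needed(usable, 'raid5_ftt1'):.1f} TB raw")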

In addition, if you don't overcommit memory, you can enable an advanced setting called Sparse Swap.  VSAN 6.2 Part 5 - New Sparse VM Swap Object - CormacHogan.com
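If you go that route, this is a minimal sketch of flipping the option from an ESXi host.  The advanced option name comes from Cormac's post linked above; verify it against the docs for your build before relying on it, and apply it on each host.

# Minimal sketch of enabling Sparse Swap on an ESXi host by calling esxcli.
# The option name (/VSAN/SwapThickProvisionDisabled) is taken from the linked
# article; confirm it for your vSAN build.  Run once per host.
import subprocess

def enable_sparse_swap():
    subprocess.run(
        ["esxcli", "system", "settings", "advanced", "set",
         "-o", "/VSAN/SwapThickProvisionDisabled", "-i", "1"],
        check=True,
    )

if __name__ == "__main__":
    enable_sparse_swap()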

Thank you, Zach.

vDanTheMan
Contributor

If the all-NVMe config has higher performance, would you recommend building the first disk group as all-NVMe and then, if I need more space, creating a second disk group that uses the SATA3 and/or SAS ports?  That would effectively create two performance tiers.

I would love to add a fourth node for the reasons you mention, but this config is going to be expensive, probably in the $5k to $10k range, and there is a significant WAF (Wife Acceptance Factor) to consider.  However, I already have the rack, UPS, 1G Ethernet switch, 10G Ethernet switch, 40G InfiniBand switch, and three Mellanox ConnectX-3 dual-port VPI adapters (configurable per port as 10/40G Ethernet or 40/56G FDR InfiniBand).  My plan is to run the vSAN network on 10G Ethernet and the other ESXi networks across 40G InfiniBand using IPoIB.  I have heard that vSAN is not supported on IPoIB, although I might give it a try for curiosity's sake.

Just the noise and power from the three planned nodes plus switches will be high, so I am going to have to sound-proof the small room in the basement this equipment runs in (a whole other challenge: keeping a small room sound-proofed and cool without a secondary A/C system... thank God it's in the basement).  However, given enough time after building the cluster (like a year or two), I may get the wife to accept a fourth node.

Love the Sparse VM Swap Object setting.  Will definitely do that.  Thanks for the link.

zdickinson
Expert

Good morning.  To answer your question about adding a second disk group with SATA and/or SAS mixed with NVMe: you can't create tiers within vSAN.  You can only have one vSAN datastore per cluster, and that datastore spreads across all the disk groups, so there is no telling where a piece of a VM will reside.  I would say that would bring your NVMe performance down toward the SATA/SAS level.
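As a toy illustration of why (the throughput numbers are placeholders, not measurements): with components spread roughly in proportion to disk group capacity, the effective performance drifts toward a capacity-weighted blend rather than staying at NVMe speed.

# Toy illustration: vSAN spreads a VM's components across all disk groups in
# the cluster, so with mixed groups the effective performance tends toward a
# capacity-weighted blend.  Throughput figures below are placeholders.

disk_groups = [
    {"name": "all-NVMe group",    "capacity_tb": 4, "read_gb_s": 3.0},
    {"name": "NVMe + SATA group", "capacity_tb": 8, "read_gb_s": 0.9},
]

total_tb = sum(g["capacity_tb"] for g in disk_groups)
blended = sum(g["read_gb_s"] * g["capacity_tb"] / total_tb for g in disk_groups)
print(f"Capacity-weighted estimate: ~{blended:.1f} GB/s "
      f"(vs {disk_groups[0]['read_gb_s']:.1f} GB/s for all-NVMe alone)")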

A question: you mention Hadoop and a few other techs that I have heard of but have no experience with.  Do you have an entry point into the Hadoop, Docker, container world?  I know that's where the future is, but I am having a hard time getting a handhold.  Thank you, Zach.

vDanTheMan
Contributor

Zach, good point on there being only one tier/datastore per cluster.  I wish I had the time/money to test how adding SATA/SAS drives would impact the overall performance of an all-NVMe array.  Based on that, perhaps I should focus on all-NVMe.  With the motherboards mentioned having two or three "Ultra" M.2 or U.2 interfaces, I can start with those and then add PCIe-to-M.2 adapter cards in the remaining slots, like this: How to install Samsung 950 PRO M.2 SSD in a PCIe slot - tested with Supermicro 5028D-TN4T & Lycom DT...

I will have to reserve one slot for the Mellanox ConnectX-3 card, which is an x8 card, and I may have to reserve another slot for a video card.  Depending on the motherboard and processor, there is also likely to be contention for available PCIe lanes.  Each Ultra M.2 or U.2 interface consumes 4 lanes, the Mellanox card will use 8 lanes, and the video card will consume some additional lanes.  Ultimately, PCIe lane contention can limit the peak I/O of a system, although reaching the I/O limit of recent processors with real-world applications would truly be an impressive feat.

I note that the ASRock X99 Extreme11 has two embedded PLX PEX 8747 PCIe switch chips, which allow that motherboard to funnel a lot of lanes to the CPU.  However, I believe some of the PCIe lanes are reserved for the chipset, which connects to the SATA ports and the LSI 3008 SAS controller.  That could mean that total overall bandwidth to the CPU would be increased by using some of the SATA/SAS connections.  Some of the motherboards indicate that the Ultra M.2/U.2 interfaces share PCIe lanes with the SATA/SAS interfaces and preclude using both at the same time.  I will have to look into that further to understand how the total bandwidth is split.
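To keep myself honest about the lane budget, here is a rough per-node tally.  It assumes a 40-lane LGA 2011-v3 CPU; a 28-lane SKU, or a Skylake part with only 16 CPU lanes plus chipset lanes, changes the headroom, and chipset-attached SATA ports are not counted against CPU lanes here.

# Rough per-node PCIe lane budget.  Assumes a 40-lane LGA 2011-v3 CPU; adjust
# cpu_lanes for a 28-lane SKU or a 16-lane Skylake part.  Chipset/DMI-attached
# devices (SATA ports, etc.) are not counted against CPU lanes.

cpu_lanes = 40

devices = {
    "3 x Ultra M.2 / U.2 NVMe (x4 each)": 3 * 4,
    "Mellanox ConnectX-3 VPI (x8)": 8,
    "Video card (x8 is enough here)": 8,
}

used = sum(devices.values())
print(f"Lanes requested: {used} of {cpu_lanes}")
print(f"Headroom for PCIe-to-M.2 adapter cards: {cpu_lanes - used} lanes "
      f"(~{(cpu_lanes - used) // 4} more x4 NVMe drives)")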

With respect to your question about the technologies I am planning on learning/testing, I am focusing on a few general themes.  I don't have any particular pre-existing connections into these technologies, but have been learning/reading about them for some time.  I want to understand technologies that enable continuous delivery, customer/business/developer-driven consumption of infrastructure services with automatic costing/billing, automation of infrastructure configuration based on selection from application catalogs, big data databases and storage technologies, and analysis of big data including statistical and classification methods.  Technologies of interest include:

Open source virtualization - OpenStack / VMware Integrated OpenStack

Containers - Docker / Photon / VMware Integrated Containers

Infrastructure Automation, Billing, Chargeback - vRealize Suite

Configuration Automation - Chef / Ansible / Puppet / Salt (I will probably focus on Chef)

Continuous Delivery - Jenkins / Maven / Ant

NoSQL Databases - Cassandra / MongoDB / DynamoDB / HBase / Redis

Hadoop - Cloudera / Hortonworks / MapR

Distributed shared-nothing storage handling block, file, and object services - Ceph

Statistical analysis and classification - R / MATLAB / Maple / Mathematica / Mahout
