What Does it Take to Run VCF 9?

I have had so many conversations around this topic that I wanted to capture my thought process here: how to avoid injecting operational or performance risk as we take our first steps with VCF.

So, let's dive into the current offerings from Broadcom. We are only going to be discussing their primary offering…VMware Cloud Foundation (VCF).

VCF is what Broadcom leads with. VVF (VMware vSphere Foundation) also exists, but it is not the topic of this post.

We are not talking about every single SKU or offering they have, as there are quite a few additional "add-on" licenses that exist for VCF. None of these add-on licenses are being discussed in this post, in any way.
BUT, you SHOULD discuss "add-ons", as you may want/need some.

And don't forget: vSphere 8 End of Support is October 2027. That's only 19 months away…start planning your approach now! The only way to vSphere 9 is VCF (or potentially VVF).

For the purposes of this post, we are going to sit in the role of an enterprise architect who has to size new hardware for a new application stack that drives the business, just as if we were having this discussion about rolling out SAP, Oracle E-Business Suite, or PeopleSoft. We wouldn't want to introduce unnecessary risk for those applications, right?

What is VCF?

So let's start with what VCF is, and what it takes to deploy and run VCF. VCF is a private-cloud platform. You do not get to pick and choose the individual components.

Just like when you go to buy a car…if you want a moonroof, and it is only available in the "touring edition" package of the car, you get the touring edition. Don't want the heated steering wheel or seats? Well, they came with the touring edition, so you can either use them or not…but they came with the touring edition.

You don鈥檛 have to use EVERYTHING that came with the touring edition, but I鈥檒l bet you appreciate those heated seats on nights when its 7掳F.

What Does VCF (The Private Cloud Platform) Give Me?

Essentially, all the same capabilities you get (and expect) from a public cloud provider (AWS, Azure, GCP…any of the hyperscalers). It is a platform to run VMs, containers, K8s workloads, VPC networking constructs, monitoring, troubleshooting tools, and automation/self-service you can build. Also included are logging capabilities, insights into your network traffic flows, workload mobility, SSO, etc.

What Makes Up The VCF "Application"?

Let's list this out (you'll see VCF in front of a bunch of the products that you might remember as vRealize, which was rebranded to Aria, and is now prefixed with VCF). Here are the 14 "components" that comprise VCF:

  • VCF Operations
  • VCF Operations Collector
  • VCF Operations Fleet Management
  • VCF Operations for Logs (used to be Log Insight)
  • VCF Operations for Networks (used to be Network Insight, or vRNI)
  • VCF Operations HCX
  • VCF Operations Orchestrator
  • VCF Automation
  • VCF Identity Broker (provides SSO capability)
  • vSphere Replication
  • VMware ESX
  • VMware NSX
  • VMware vCenter
  • VMware vSAN

You will see all of these components if you try to download VCF 9.0 from the Broadcom Support Portal. I know I'm referencing 9.0.0.0, the GA release, but let's see the forest through the trees for this discussion (login required to get to that page!).

This sounds like a lot, and it is, when we (like many) compare it to what we have known for years as vSphere (which is just ESXi and vCenter).

How Do You Get This Deployed?

That might be another post, or better yet, take a 1-day workshop with us here at WEI, and we can show you HOW it gets deployed. About a week (or two, depending on the size of the committee) of planning. About a week of deployment & configuration (to do it right). A few days (to five) to polish up the rest of your new "on-prem private cloud". So, for the time being, we will just say it gets deployed…

Management Domain

This initial deployment for VCF is what is called the "Management Domain". The Management Domain runs all those products we listed out above and will then be the location where the management VMs for "Workload Domains" are expected to run…more on that later.

Does it seem like you need a lot of resources to run this full VCF stack in the Management Domain? Well, that depends on what you consider "a lot of resources"…

  • Total vCPUs allocated: 234 vCPU
  • Total RAM allocated: 825-GB RAM
  • Total Storage allocated: 15.5-TB
  • Total Storage consumed: 4-TB

鈥nd this is with the smallest deployable VM sizing available via VCF-Installer process. Ask us for the RV tools export of a newly deployed VCF environment.

What else might you run in the "Management Domain"? Setting aside that running Windows Server and/or Red Hat VMs requires licensing…

  • Domain Controllers
  • IdP connectors
  • Backup Servers
  • Security workloads

鈥nd other backend functions鈥ut don鈥檛 overdo it. This Management Domain will have other things to run.

This Management Domain is running 25 new VMs to start. You see the resources (listed above) those VMs will require. You see all the different components listed earlier that are integrated together…and we want to do it right the first time, because if you can't do it right the first time, when will you find the time to fix it later? My advice:

  • Start with 4 x new ESXi servers running vSAN ESA (requires NVMe drives).
  • Brand new, or (very modern) repurposed vSAN ESA Ready Nodes, but they WILL be wiped as part of this process.
  • We will deploy VCF together on those new servers and create the Management Domain.
  • Could you use FC (not FCoE) or NFS? Sure, but given the small cost of a few NVMe drives to run vSAN ESA, we can isolate this "VCF Application" and guarantee the resources required to run our enterprise application, VCF. Plus, it is recommended by the vendor, VMware, to use vSAN for the Management Domain. We will repurpose your external storage when we get to the Workload Domains.

After the Management Domain is configured, we can then import your existing vCenter Servers and the clusters that they manage (and more importantly, the VMs that they run). More on that in a bit.

Taking a step back, we realize that to run VCF in a risk-averse implementation, we need a new VMware Cluster of 4 x ESXi hosts running vSAN ESA to get everything deployed.

Sizing the Management Domain

As quite a few of the components deployed for VCF run as 3 x VMs in a cluster, and the expectation is to have HA (High Availability) for the VMs running, you need a minimum of 4 hosts. To be redundant myself, that is a 3+1 cluster (the +1 is for the HA event, or more practically, to do maintenance without affecting production workloads).

OK fine, we can agree on 4 nodes configured as a 3+1 cluster. What about the CPU, RAM, storage & network connectivity needed?

CPU: For CPU, let鈥檚 focus on the number of vCPUs required. Do you want to oversubscribe the management cluster? You can, but remember, this is what manages your VCF stack, so heavy oversubscription is not the answer.

Should you do a 1:1 VM vCPU for each physical CPU core? I would love to see that happen, but our pocketbooks are not infinite.

OK, so do we go 2:1, or 5:1, or 10:1? For this Management Domain, I'm happy to agree to a 2:1 CPU oversubscription.

  • Let's work with sizing based on a CPU with 32 cores per socket.
  • Put 2 x CPUs in each ESXi host (64 cores).
  • Go with the 4-node cluster (technically a 3+1 cluster) just discussed.
  • That gives me 256 total cores for the raw total…technically, that's 192 cores (3 nodes + 1 for HA) usable.
  • The total vCPUs allocated to the VMs for VCF to get started is 234 vCPUs…
  • We are already at 1.22:1 CPU oversubscription (234 / 192), and we haven't added any other workloads or VCF functions yet.
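The bullets above can be sketched out in a few lines; this is a minimal illustration using only the figures quoted in this post (32 cores/socket, 2 sockets/host, a 3+1 cluster, 234 vCPUs allocated, and the 2:1 target we just agreed to):

```python
# A minimal sketch of the Management Domain CPU oversubscription math.
# All figures come from the sizing in this post.

cores_per_socket = 32
sockets_per_host = 2
hosts, ha_reserve = 4, 1
vcf_vcpus = 234                # vCPUs allocated by the base VCF deployment

cores_per_host = cores_per_socket * sockets_per_host     # 64
raw_cores = cores_per_host * hosts                       # 256
usable_cores = cores_per_host * (hosts - ha_reserve)     # 192

ratio = vcf_vcpus / usable_cores
print(f"Usable cores: {usable_cores}, oversubscription: {ratio:.2f}:1")
# -> Usable cores: 192, oversubscription: 1.22:1

# Headroom left before hitting the 2:1 ceiling discussed above:
print(f"vCPUs left at 2:1: {usable_cores * 2 - vcf_vcpus}")
# -> vCPUs left at 2:1: 150
```

So even at the agreed 2:1 ceiling, the base deployment has already consumed more than half the vCPU budget before a single Workload Domain or business VM is added.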

RAM: Let's start with 512-GB per node (I'd really prefer 1-TB per node, but let's start here, just for the math). That gives you 2-TB of RAM for the raw total. But technically it's 1.5-TB of RAM (3 nodes + 1 for HA again). And we are already using 0.8-TB just to get started, and we haven't added any other workloads or VCF functions yet.

What about memory oversubscription? I'm not a fan of that (most of us can agree that swapping RAM is a bad idea), but there is another way to get more usable RAM, and that is with NVMe Memory Tiering (add an NVMe drive to increase the "RAM" installed in the host). Add in NVMe Memory Tiering, and 512-GB per ESXi host isn't a terrible starting point.

I would recommend 1-TB per host to get started.
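The RAM math above, made concrete; a minimal sketch using only this post's figures (512-GB/host, 3+1 cluster, 825-GB allocated), plus the same math at the recommended 1-TB per host:

```python
# A minimal sketch of the Management Domain RAM math from this post.

ram_per_host_gb = 512
hosts, ha_reserve = 4, 1
vcf_ram_gb = 825               # allocated by the base VCF deployment

raw_ram_gb = ram_per_host_gb * hosts                     # 2048 GB (2-TB)
usable_ram_gb = ram_per_host_gb * (hosts - ha_reserve)   # 1536 GB (1.5-TB)

print(f"Used at day one: {vcf_ram_gb / usable_ram_gb:.0%} of usable RAM")
# -> Used at day one: 54% of usable RAM

# The same math at the recommended 1-TB per host:
usable_1tb_gb = 1024 * (hosts - ha_reserve)              # 3072 GB
print(f"At 1-TB/host: {vcf_ram_gb / usable_1tb_gb:.0%} of usable RAM")
# -> At 1-TB/host: 27% of usable RAM
```

Over half the usable RAM is gone on day one at 512-GB per host, which is why 1-TB per host is the more comfortable starting point.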

vSAN ESA Storage: It's ~16-TB allocated (thank goodness for thin provisioning in vSAN!). That's before any growth, data ingestion, logs, snapshots, data retention, or even VM templates are considered…so let's add 50% of that to start: 24-TB. That's 24-TB of USABLE storage, not RAW capacity. 24-TB of RAID-1 is 48-TB RAW.

But vSAN ESA has some great storage efficiency (writes via RAID 1 and, depending on the number of ESXi hosts in the cluster…cold data at RAID 5 or 6), and global deduplication is coming soon as well.

So, 48-TB of raw capacity can get you a minimum of 24-TB usable capacity. That means each ESXi host needs to contribute 12-TB of RAW disk capacity. That's 3 x 4-TB drives.
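The capacity math above can be sketched as follows. Note the hedge: RAID-1 mirroring is modeled as a flat 2x overhead, matching the "24-TB usable = 48-TB raw" statement in the text; real vSAN ESA overheads vary with storage policy and cluster size.

```python
# A minimal sketch of the vSAN ESA capacity math from this post.
# RAID-1 is modeled as a simple 2x overhead (an assumption for illustration).

allocated_tb = 16                        # thin-provisioned allocation
usable_target_tb = allocated_tb * 1.5    # +50% growth buffer
raid1_factor = 2
raw_tb = usable_target_tb * raid1_factor

hosts = 4
per_host_tb = raw_tb / hosts
drives_per_host = per_host_tb / 4        # using 4-TB NVMe drives

print(f"Usable target: {usable_target_tb:.0f} TB, raw: {raw_tb:.0f} TB")
print(f"Per host: {per_host_tb:.0f} TB -> {drives_per_host:.0f} x 4-TB drives")
# -> Usable target: 24 TB, raw: 48 TB
# -> Per host: 12 TB -> 3 x 4-TB drives
```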

Yes, you can add more storage to each node in the future (be sure to select hardware ready to do that).
鈥nd don鈥檛 forget to add another NVMe drive for Memory Tiering鈥(typically a different part number than the ones used for vSAN).

Networking (physical NICs): Pretty easy for most of us. We want redundant networking that meets the minimum requirements set forth by our application vendor: 2 x 25-GbE NICs.

25-GbE has been around since 2016, and affordable as a ToR (Top of Rack) solution since 2019. Nearly every server today ships with 10/25-GbE NICs onboard. Plus, it is recommended by our VCF "Application" vendor, so we follow their recommendations, given that the absolute minimum is 10-GbE. Latency must also be < 1ms.

Can you use more than 2 NICs per host? Yes, and you might do that to separate storage or NSX network traffic. We can discuss it, of course, though for most folks I would stick with a pair of 25-GbE NICs for the Management Domain.

Summary of Management Domain Sizing

You need 4 x ESXi servers ready for vSAN ESA, each configured with:

  • 2 x 32-core CPUs
  • 1-TB RAM
  • 3 x 4-TB NVMe drives (for vSAN ESA)
  • OS boot Drive (Another NVMe, only needs 128-GB minimum)
  • 2 x 25-GbE NICs

Optional, but highly recommended: 1 x 4-TB NVMe for Memory Tiering. This is what is needed to run the VCF "application", while minimizing risk, delivering an acceptable SLA for performance & recovery, and providing the ability to scale out or up.

But Aren鈥檛 There Minimal Deployments?

Yes, there are. I suggest you review Broadcom's VCF Design Blueprint documentation. Quoted right from that documentation…

"This Design Blueprint can be used as a full end-to-end design for a VMware Cloud Foundation platform or as a starting point and adjusted to suit your specific objectives by substituting any of the design selections listed below with alternative models."

This is a great starting point for building a lab or demo environment to get yourself familiar with VCF capabilities and features. However, it is not a recommended way to implement something that is delivering mission-critical capabilities for the business.

And you still need about 45% of the resources we discussed earlier for the Management Domain. Yet you are not deploying everything that you have purchased to help you run a private cloud.

Let's say we do this minimum deployment…we are adding risk, with high-impact scenarios that can play out in production. Well, what if we add the availability after the fact? I'll bring up that quote again: "…if you don't have time to do it right, when will you have time to fix it?"

This design takes the application VMs (VCF Automation, VCF Operations, and NSX) that are typically spread out as 3 x VMs, and runs them as a single VM each. While they do function, they are not truly highly available and add many single points of failure to the applications they serve, which essentially adds risk to your VCF-created private cloud. Yes, they benefit from vSphere HA (which we have had since 2006 with Virtual Infrastructure 3), but that is not the way these applications were designed to run.

This minimal deployment design uses a cluster that is shared for Management Domain functions as well as any VM workloads that you see fit to mix with the Management Domain. We will call it a Consolidated Domain model (the language used in VCF releases prior to 9.0). This will work, yes, but it is not what we expect from any of our applications that drive the business. Minimizing risk is one of the things I have focused on in my 30+ years of working in IT.

鈥ut the design docs you just linked to say it can be used that way! That is true, but it does not explain that you now need to take outages, additional work, and have limited options when you do updates, patches, or upgrades in the future鈥.all things that are required in the lifecycle of IT any infrastructure component or solution.

Imagine us having this discussion if we were rolling out SAP, Oracle E-Business Suite, or PeopleSoft. We wouldn't want to introduce unnecessary risk for those applications, right?

Reuse Existing vSphere Environment

ABSOLUTELY!…just not for the Management Domain. We still need to run the VMs that are running on our existing vSphere environments, right? That environment isn't going away anytime soon. We will end up running each of your existing vCenter Servers as a "Workload Domain" (explanation coming soon, I promise).

So long as the server hardware is supported to run ESXi 8.x or 9.x. (vSphere 7 support ended October 2025).

Do I have to use vSAN? No, but you can use vSAN if you would like (or need) to. You can use your existing NFS, FC, FCoE, or iSCSI SANs without issue. If you are using vVols, be aware that in vSphere 9, support is deprecated and vVols will be going away soon, so I would prefer to help you migrate off vVols at this time, rather than later.

What about my vCenter Server(s)? While it is possible to use vCenter 8, we would recommend upgrading to vCenter 9. Yes, if vCenter is at version 9, you can still manage ESXi 8.x hosts. We will bring in those existing environments and make them part of your new VCF application.

Then we can take advantage of all the capabilities that VCF brings, most importantly, rightsizing your environment (as licensing CPU cores for no reason can be expensive). That means sizing your VMs as well as your physical servers running ESXi, optimizing your resources so that they better align with the business outcomes defined and needed by your organization.

Workload Domains

While there is (in nearly all cases) only going to be a single Management Domain, focused on providing VCF functions, management, and capabilities, Workload Domains are very different, yet instantly familiar to us.

Essentially, a Workload Domain is very similar to what we are used to, if we think about any of our vSphere environments (any that are version 8 or earlier). It is a vCenter, and an NSX implementation, that runs the VMs that power the applications that our business needs.

Any Workload Domain is going to run the VMs that are currently running. THIS is where we can repurpose existing ESXi hosts and existing storage you have.

That鈥檚 it! Workload Domains are very flexible in how we create or import them. We can use storage other than vSAN (though you can still use vSAN here if you鈥檇 like).

What鈥檚 the difference between deploying a new Workload Domain, or importing an existing vCenter into VCF as a Workload Domain? The process to deploy versus import. That鈥檚 it.

So why the separation of duties like this? That's just how Broadcom created VCF to work, so I just play by the rules provided to me. As a matter of fact, I like the separation of Management from Workload; I've been doing that in my designs since 2009, and many of those designs are in their 4th or 5th generation now, all well before the Broadcom acquisition and what is now VCF 9.

Since the "Management" of your Workload Domain is vCenter and the 3 x NSX control VMs…guess where they run? The Management Domain! Yes, even if we import the existing vCenter that is running on your existing cluster, that's where we should migrate it to.

Are there no other VMs needed to support the Workload Domain? Yes, there are, but they are all already running in the Management Domain.

So, creating (or importing) a Workload Domain requires additional resources in the Management Domain:

  • Total vCPUs allocated: 44 vCPU
  • Total RAM allocated: 174-GB RAM
  • Total Storage allocated: 2-TB
  • Total Storage consumed: 0.5-TB
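Putting that per-domain overhead together with the Management Domain sizing from earlier, a quick sketch shows how fast the headroom gets consumed. To be clear, the 2:1 CPU ceiling and the 192-core / 1536-GB usable figures are this post's sizing, not a VMware-published formula:

```python
# A minimal sketch: Management Domain headroom as Workload Domains are added,
# using the per-domain overhead quoted above (44 vCPU, 174-GB RAM).

BASE_VCPUS, BASE_RAM_GB = 234, 825        # base VCF deployment
WD_VCPUS, WD_RAM_GB = 44, 174             # overhead per Workload Domain
USABLE_CORES, USABLE_RAM_GB = 192, 1536   # 3+1 cluster, 512-GB/host

def fits(workload_domains: int, cpu_ratio: float = 2.0) -> bool:
    """Rough check: does the Management Domain stay within budget?"""
    vcpus = BASE_VCPUS + WD_VCPUS * workload_domains
    ram_gb = BASE_RAM_GB + WD_RAM_GB * workload_domains
    return vcpus <= USABLE_CORES * cpu_ratio and ram_gb <= USABLE_RAM_GB

for n in range(1, 6):
    print(f"{n} Workload Domain(s): {'fits' if fits(n) else 'over budget'}")
# -> 1 through 3 fit; 4 and 5 blow past the 2:1 CPU budget
```

At 512-GB per host, roughly three Workload Domains exhaust the 2:1 vCPU budget; another argument for the 1-TB-per-host and headroom recommendations made earlier.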

Sizing the Workload Domain: Well, what about the sizing? The expectation is to have HA (High Availability) for the VMs running…you need a minimum of 3 hosts. To be redundant myself (again…funny!)…that is a 2+1 cluster (the +1 is for the HA event, or more practically, to do maintenance without affecting production workloads).

What about sizing the CPU, RAM, and Storage (3-tiered or vSAN ESA)? That will vary with each Workload Domain's Cluster. That's right, every Workload Domain can have up to 400 x VMware Clusters, each up to 64 ESXi hosts. That's a lot of resources being managed by just 1 vCenter Server.

Sizing a VMware Cluster

We have all been sizing VMware vSphere Clusters since 2006. The sizing exercise we went through earlier for the Management Domain happens in almost every environment, but quite often I see the following situation play out.

Time to refresh the VMware infrastructure, so let's size it to run the current workload plus 25% additional growth for the next 3 years. Five years later, we realize we are running 400% of the planned workload, and we wonder why the performance of our most critical app is suffering. Good thing we will have the tools available to help us with that moving forward…

How do you break out each VMware Cluster, or better said, size each VMware Cluster? I would take the same approach I took above for sizing the Management Domain.

What Design Qualities are most important for THAT SPECIFIC workload? Availability, Manageability, Performance, Recoverability, Scalability, or Security? How do we prioritize those Design Qualities for THAT VMware Cluster?

鈥nd we will do that for each of the components that make up your VMware Cluster:
Compute, Storage, Networking, Management, Workloads, Analytics, Chargeback, Reporting, and of course, Compliance.

Just like building a VMware Cluster dedicated to MS SQL or Oracle, you plan your workload requirements, size accordingly, and run it. Extra capacity? Let's put other VMs on that VMware Cluster for Oracle…NOPE! That was designed a specific way for a specific purpose. That extra capacity is there for a reason, not to be consumed on a whim by something that is not running Oracle.

Questions? Reach out to us, or fill out the Contact Us form here at wei.com.