VMware Cloud Foundation 4.2 to 4.4 upgrade/relaunch (Part 1)

One of my favourite customers is in Sweden. We were introduced to them around five years ago by a hardware vendor to help with day 2+ services, developing the Horizon VDI environment that had been deployed during the original VCF bring-up (VCF 2!).

In the five years we’ve worked together we’ve completely rebuilt the Horizon environment outside of the VCF management function (before VMware decided to drop VDI lifecycle management). It now sits on a workload domain within VCF and is treated like any other workload domain.

With kit refresh time upon us, we were fortunate to be able to workshop our requirements together, spec out a brand new VCF 4 solution (4.2 at the time) in the new production datacentre, and plan a complete migration from the VCF 3.x environment in the other datacentre.

‘The New’

The first thing we were keen to get right was building the new design with a second datacentre in mind: ultimately we will want another site to come online and take advantage of NSX-T Federation for workload mobility. So we agreed naming consistencies up front and built out a BGP framework that would let us establish relationships between the two environments later.

The very first part of deploying a VCF solution is deploying the Cloud Builder OVA, which performs the actual deployment of the VCF Management Workload Domain.

We were fortunate to have the existing environment to deploy the OVA onto, but you could easily use a temporary machine for this if needed. With COVID upon us we weren’t allowed to travel from the UK to Sweden, so we worked in partnership with the customer: they assigned the out-of-band consoles on the new servers and plugged in the network ports whilst we configured the top-of-rack switching.

With the 4 x management workload domain servers racked, cabled, and the switch ports configured, we were ready to begin bring up.

The first step was to make sure the hosts were on the latest supported firmware and that ESXi was installed at the correct build number for VCF 4.2:

https://docs.vmware.com/en/VMware-Cloud-Foundation/4.2/rn/VMware-Cloud-Foundation-42-Release-Notes.html

For VCF 4.2 we required ESXi 7.0 Update 1d (build 17551050), which as a patch release did not have an ISO available on https://my.vmware.com

That was fine, we thought: the first step is normally to take the latest hardware vendor ESXi ISO and patch it to the requisite release anyway, getting the best of both worlds. We did exactly that, following a process broadly aligned to the VMware prescribed one:

https://docs.vmware.com/en/VMware-Cloud-Foundation/4.2/vcf-deploy/GUID-D43A3FAC-682E-46F7-8342-03364EE5D2CC.html
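If building a custom ISO isn’t practical, a sketch of an alternative: install the vendor 7.0 image first, then patch each host in place from the offline bundle. The depot filename and profile name below are illustrative; check the real profile name in your bundle with `esxcli software sources profile list`.

```shell
# Upload the offline bundle to a datastore the host can see, then
# update the host's image profile in place (names are illustrative):
esxcli software profile update \
  --depot=/vmfs/volumes/datastore1/VMware-ESXi-7.0U1d-17551050-depot.zip \
  --profile=ESXi-7.0U1d-17551050-standard
reboot
```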

With our newly baked ISO we were able to install ESXi onto the 4 x management workload domain hosts, enable SSH, enable NTP, and ensure the correct hostname and domain name were configured.
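As a rough sketch, the host-side prep can be done over SSH or the ESXi console; the hostname and NTP server below are placeholders, and the `esxcli system ntp` namespace assumes ESXi 7.0 (older builds need the host client instead):

```shell
# Enable SSH and start it (also possible from the DCUI or host client)
vim-cmd hostsvc/enable_ssh
vim-cmd hostsvc/start_ssh

# Set the FQDN to match the Cloud Builder deployment spreadsheet (placeholder names)
esxcli system hostname set --host=mgmt-esxi01 --domain=example.local

# Point the host at NTP and allow the traffic through the host firewall
esxcli system ntp set --server=ntp.example.local --enabled=true
esxcli network firewall ruleset set --ruleset-id=ntpClient --enabled=true
```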

A coffee in hand, we hit ‘go’ on the Cloud Builder bring-up process, and it promptly failed, complaining that the certificates on the ESXi hosts did not match the names we had provided in the deployment spreadsheet.

This was easily fixed: it is caused by the default ESXi install shipping a ‘localhost’ SSL certificate, which did not match the hostnames we gave the servers.

A quick /sbin/generate-certificates and a reboot of each ESXi host (you could also just restart the management agents, but just in case, a reboot is often cleaner) and we could see the certificate displaying correctly in a browser for each host. A retry allowed the bring-up to continue.
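For reference, the fix on each affected host was along these lines, run over SSH (the agent restart is the lighter-touch alternative to a full reboot):

```shell
# Regenerate the self-signed certificate using the host's configured FQDN
/sbin/generate-certificates

# Then either reboot the host...
reboot

# ...or restart the management agents instead:
# /etc/init.d/hostd restart && /etc/init.d/vpxa restart
```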

It next failed with ‘failed to validate BGP Route Distribution’. We had expected some challenges here, as we had converted the TOR switch configs from Cisco Nexus to Brocade VDX, and the issue turned out to be that we had not advertised the default route down from the TOR to NSX-T.

The config on our TOR switch pair for BGP looks like:

router bgp
  local-as 65001
  neighbor ip-of-nsxt-mgmt-edgenode01 remote-as 65003
  neighbor ip-of-nsxt-mgmt-edgenode01 update-source ip-of-tor-switch
  neighbor ip-of-nsxt-mgmt-edgenode01 password (bgp-password)
  neighbor ip-of-nsxt-mgmt-edgenode01 soft-reconfiguration inbound
  neighbor ip-of-nsxt-mgmt-edgenode02 remote-as 65003
  neighbor ip-of-nsxt-mgmt-edgenode02 update-source ip-of-tor-switch
  neighbor ip-of-nsxt-mgmt-edgenode02 password (bgp-password)
  neighbor ip-of-nsxt-mgmt-edgenode02 soft-reconfiguration inbound
  address-family ipv4 unicast
    always-propagate
    default-information-originate
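With default-information-originate in place, the peering can be sanity-checked from the NSX-T side. From memory, the edge node admin CLI commands are along these lines (the VRF number varies per deployment, so treat the `1` as a placeholder):

```shell
# On each NSX-T edge node admin CLI:
get logical-routers            # note the VRF of the Tier-0 service router
vrf 1                          # enter that VRF (number is deployment-specific)
get bgp neighbor summary       # both TOR sessions should show Established
get route bgp                  # expect 0.0.0.0/0 learned from the TOR pair
```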

Upon retry this worked fine and the bring-up completed. We could say farewell to the Cloud Builder appliance for now (we’ll need it later to bring up the other VCF stack) and log into the newly deployed SDDC Manager.

If you’re not so lucky and would like an opinion on what could be wrong, grab the output of /opt/vmware/bringup/logs/vcf-bringup-debug.log from the Cloud Builder VM.
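A couple of quick ways we triage that log (paths are as on the Cloud Builder appliance; the FQDN is a placeholder):

```shell
# On the Cloud Builder VM, surface recent failures:
grep -iE 'error|fail' /opt/vmware/bringup/logs/vcf-bringup-debug.log | tail -n 20

# Or pull the log off for closer inspection:
scp admin@cloud-builder.example.local:/opt/vmware/bringup/logs/vcf-bringup-debug.log .
```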

Once the SDDC Manager is accessible, I always recommend completing the basic tasks of setting up the backups and the certificate authority configuration, so that SSL certificates can be requested and installed by SDDC Manager onto components. The VMware documentation for this process is quite comprehensive: https://docs.vmware.com/en/VMware-Cloud-Foundation/4.2/vcf-admin/GUID-80431626-B9CD-4F21-B681-A8F5024D2375.html

Once the basic SDDC components (vCenter, SDDC Manager, and NSX-T) are configured with SSL and, crucially, still function correctly, we can move on.

vRealize and beyond

In effect the current 4 x host management workload domain is just a four-node vSAN cluster that can run virtual machines, but as the customer has vRealize Suite licensing we can make it much more.

Traditionally the SDDC Manager within VCF tried to own everything, but VMware have realised that vRealize Suite Lifecycle Manager (VRSLCM), which handles lifecycle for the vRealize Suite, is a better ally for this complex operation.

The first stage is asking the SDDC Manager to deploy VRSLCM for us; this guide is quite straightforward to follow, and by the end of it VRSLCM should be deployed and accessible. https://docs.vmware.com/en/VMware-Validated-Design/6.2/sddc-deployment-of-cloud-operations-and-automation/GUID-B955E1DE-7AFC-417E-9E48-D251F3EAEE17.html

Once VRSLCM is deployed, we can use it to deploy Workspace ONE Access (I’ll forever call you VMware Identity Manager) by following https://docs.vmware.com/en/VMware-Validated-Design/6.2/sddc-deployment-of-cloud-operations-and-automation/GUID-E6B0AFE4-7CC9-46C7-887F-7E31C14F1B67.html

Note, pay attention to the SSL Certificate part here as SDDC Manager will not manage these certificates. SDDC Manager only manages the certificates of vCenter, SDDC Manager, and NSX-T Manager.

For vRealize Suite components we have to leverage VRSLCM, which does make sense, but you would like to think VMware could have at least displayed the certificates within SDDC Manager, or alarmed on their impending expiry.

Once Workspace ONE Access is deployed, we can integrate VRSLCM with it so that it becomes our single point of authentication within the vRealize Suite. Do keep the built-in local admin password handy though: when (not if) the link is broken, it is useful to still be able to log in to fix it.

Next we can deploy components such as vRealize Log Insight and vRealize Operations Manager, which SDDC Manager will notice and offer to connect to your workload domains, which is rather handy.

Our customer also has licensing for vRealize Network Insight, which is not SDDC/VCF aware but can be deployed and managed day to day within VRSLCM in the same way. It is pleasing to see VMware not limiting the products to what SDDC/VCF can manage.
