Getting ready to upgrade vRA from 6.x to 7.x? If your system contains imported VMs, pay attention.
There’s a known issue with imported VMs when doing an in-place upgrade. VMware has a KB 2150515 on the issue, but you have to do the fix before you upgrade. As it happens, we noticed the KB after the upgrade was done and things were in a bad state. If you can, use the KB. If it is too late, read on.
EDIT: Updated for EHC 4.1.2
Architecting large scale cloud solutions using VMware products have several maximums and limits when it comes to scalability of the different components. People tend to look at only vSphere limits, but the cloud also has several other systems with different kind of limits to consider. In addition to vSphere, we have limits with NSX, vRO, vRA, vROps and the underlying storage. Even some of the management packs for vROps have limitations that can affect large scale clouds. Taking everything into consideration requires quite a lot of tech manual surfing to get all the limitations together. Let’s inspect a maxed out EHC 4.1.2 configuration and see where the limitations are.
EHC 4.1.1 can support up to 4 sites with 4 vCenters across those sites with full EHC capabilities. On top of that, we can connect to 6 more external vCenters without EHC capabilities (called vRA IaaS-Only endpoints). There are many things to consider when designing a multi-site solution, but one aspect is often omitted: latency. If you have two sites near each other, latency is usually not a problem. When it comes to multiple sites across continents, then we need to consider roundtrip times (RTT) between the main EHC Cloud Management Platform and remote sites very carefully. There are many components that connect over the WAN to the main instance of EHC and vice versa, and some of the components are sensitive to high latency. It’s also difficult to find exact information on what kind of latencies are tolerated. Often the manuals just state that “can be deployed in high latency environments” or something similar. Let’s try to find some common factors on how to design multi-site environments. For a quick glance of the latencies involved, scroll down to a summary table at the end of this post. For a bit more explanation, read on!
Well, there’s actually a CLI command to do the steps below. Just run vcac-vami installation-wizard activate, and it does everything for you. Sounds like a clean approach to me.
vRA 7.0 comes with a nice Installation Wizard to ease the process of getting vRA and IaaS Components running. However, if you butter finger the installation process by clicking Cancel and not really reading what vRA is trying to tell you (I did that), you cannot access the installation wizard again. It’s a manual installation after that, and I’m not going to do that anymore. So, let’s fix it.
This week we ran into an interesting problem during a Federation Enterprise Hybrid Cloud implementation. We had the solution implemented with VMware vRealize Automation 6.2, and everything was running smoothly. The vRA implementation was done as a distributed install, so after configuration we moved to do some vRA component failover testing. We succeeded in failing over the primary component to secondary component on all of the different VMs (vRA appliance, IaaS Web, IaaS Model Manager + IaaS DEM-O, IaaS DEM-Workers and IaaS DEM-Agents), but failback was not successful. After diving into the component logs, we found a distinctive error on almost all of them:
System.Configuration.ConfigurationErrorsException: Error creating the Web Proxy specified in the 'system.net/defaultProxy' configuration section