EHC 4.1.1 Scalability and Maximums

Architecting large-scale cloud solutions with VMware products means dealing with several maximums and limits that constrain the scalability of the different components. People tend to look only at the vSphere limits, but the cloud also has several other systems with different kinds of limits to consider. In addition to vSphere, we have limits with NSX, vRO, vRA, vROps and the underlying storage. Even some of the management packs for vROps have limitations that can affect large-scale clouds. Taking everything into consideration requires quite a lot of tech manual surfing to get all the limitations together. Let’s inspect a maxed-out EHC 4.1.1 configuration and see where the limitations are.

[Image: EHC_scalability_larger]

The above design contains pretty much everything you can throw at a VMware-based cloud design. The design is based on EHC, so some limitations stem from internal design choices, but almost all of them are relevant to any VMware cloud.

Let’s start from the top. vRA 6.x and 7.1 can handle 50 000 VMs. vRA 7.3 goes up to 75 000 VMs, but EHC 4.1.1 uses vRA 7.1. Enough for you? Things are not that rosy, I’m afraid. Yes, you can add 50 000 VMs under vRA management and it will work. It’s the underlying infrastructure that is going to cause you some grey hair. Even vRealize Orchestrator cannot support 50 000 VMs. Instead, it can handle either 35 000 VMs in a standalone install, or 30 000 VMs in a 2-node cluster. Cluster mode is the standard with EHC, so for our design, 30 000 VMs is the limit. This limit only applies to VMs under vRO’s management, for example those utilizing EHC Backup-as-a-Service. You could have VMs outside of vRO of course, so in theory you could still reach the maximum VM count. An additional vRO server is an option, but for EHC, we use only one instance for our orchestration needs. Anything beyond that is outside the scope of EHC.

Next, let’s look at the vCenter blocks of our design. A single vCenter can go up to 10 000 powered-on VMs, 15 000 in total. So just slap 5 of those under vRA and you’re good to go, right? Wrong! There are plenty of other limiting factors, such as 2048 powered-on VMs per datastore with VMware HA turned on, 1000 VMs per ESXi host and 64 hosts per cluster. These usually won’t be a problem. With EHC, you can have a maximum of 4 vCenters with full EHC Services and 6 vCenters outside of EHC Services. You can max out vRA, but you can only have EHC Services for 40 000 VMs. When we take vRO into account, the limit drops to 30 000 VMs. You can still have 20 000 VMs outside of these services on other vCenters, no problem.

[Image: vCenter_block]

Inside the vCenter block we have other components besides just vCenter. NSX follows the vSphere 6 limits, so it doesn’t cause any issues. NSX Manager is mapped 1:1 with vCenter, so single-vCenter limits apply. You can add multiple vCenters to vRA, so the overall limit will not be lowered by NSX. In addition to NSX, we have two collectors for monitoring: a Log Insight Forwarder and a vROps Remote Collector. Both have some limitations, but they don’t affect the 10 000 VM limit for the block.

As always, storage is a big part of infrastructure design. Depending on your underlying array and replication method, you might not achieve the full 10 000 VMs per vCenter. For example, vSAN can only have one datastore per cluster. As said before, combined with HA this means a limit of 2048 powered-on VMs per cluster with older vSAN versions. However, this limit no longer applies to vSAN 6.x: the maximum for a vSAN cluster is now 6400 VMs, and all of them can be powered on. You can also have only 200 VMs per host with vSAN-based solutions, whereas on a normal cluster the limit is 1024 per host. If you use a vSAN-based appliance such as Dell EMC VxRail, the vCenter limit drops to 6400 VMs since you can only have one cluster and one datastore.

[Image: vCenter_with_VxRail]

You most likely want to protect your VMs across sites. There are two methods for this with EHC: Continuous Availability (aka VPLEX/vMSC) and Disaster Recovery (aka RP4VM). The first option, EHC CA, doesn’t limit your vCenter maximum. VPLEX follows vCenter limits the same way NSX does. EHC supports 4 vCenters with VPLEX, which brings the total of CA-protected VMs to 40 000. Again, vRO limits your options a bit, to 30 000 VMs, and yes, you can have VMs outside of VPLEX protection in a separate cluster and separate vCenters. You could have 4 vCenters with 30 000 protected VMs in total with VPLEX, and on top of that 20 000 VMs outside of EHC.

[Image: vCenter_with_VPLEX]

For EHC DR, the go-to option is to use RecoverPoint for VMs. RP4VM does not use VMware SRM, and it has its own limits. The maximum for a vCenter pair is 2048 VMs with RP4VM 4.3. These limits will grow with the upcoming RP4VM 5.1 release later this year. You can have two vCenter pairs in EHC with RP4VM, so the total number of protected VMs would be 4096. You can have both replicated and non-replicated VMs in the same cluster, so the overall limit is not affected beyond vRO. We also support physical RecoverPoint appliances with VMware SRM. SRM can support up to 5000 VMs, and you can use SRM in 1 vCenter pair only. You can mix non-replicated clusters with replicated ones, so the overall limit can still be high. Combining RP4VM and SRM, you could have up to 7048 protected VMs between 2 vCenters and 2048 protected VMs between 2 other vCenters, so in total 9096 DR-protected VMs in the system.

[Image: vCenter_with_DR]

In addition to replication, backup is crucial as well. Backup design can have interesting side effects. Avamar doesn’t have a fixed VM limit, since ingesting backup data doesn’t have much to do with VM count; the data change rate is what matters. The backup system limit has to be calculated from the backup window, the number of backup proxies and the data change rate. You can have up to 48 proxies associated with an Avamar grid. Each proxy can back up or restore 8 VMs simultaneously, so the total is 384 VMs. This limit is not fixed, but changing it is not recommended. So at any given moment, you can back up 384 VMs. If your backup window is 8 hours and 1 VM takes 10 minutes to back up, your maximum is 18 432 VMs inside the backup window (assuming all 384 VMs start and finish within each 10-minute slot). There are a lot of assumptions in these calculations, so be careful when designing the backup infrastructure. You can obviously have multiple Avamar grids if needed.
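
The arithmetic is simple enough to sanity-check in any shell. This little sketch just replays the example figures above; swap in your own window length, proxy count and per-VM backup time, since all of them are assumptions for your environment:

PROXIES=48; STREAMS=8          # proxies per grid x concurrent backups per proxy
WINDOW_MIN=480                 # 8-hour backup window in minutes
MIN_PER_VM=10                  # assumed average backup time per VM
CONCURRENT=$((PROXIES * STREAMS))                 # 384 simultaneous backups
echo $((CONCURRENT * WINDOW_MIN / MIN_PER_VM))    # 18432 VMs per window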

[Image: Avamar]

If you thought that was complex, wait until we get to the monitoring block. You wouldn’t think that monitoring is a limiting factor, but you would be wrong. There are some interesting caveats that should at least be known and taken into consideration. Obviously the platform limits are what really count, but monitoring is a huge part of a working cloud environment. Log Insight doesn’t really have VM limitations; it only cares about incoming events (Events Per Second, EPS). There is a calculator out there to help with the sizing. You can connect up to 10 vCenters, 10 Forwarders, 1 vROps and 1 AD, among other things, to a single Log Insight instance. Our design uses Log Insight Forwarders to gather data from the vCenters and ship it to a main cluster.

[Image: Monitoring]

vROps is another matter. Whereas the vROps cluster can ingest huge numbers of VMs (120 000 objects with the maximum configuration), the Management Packs can become a bottleneck. The vRealize Automation Management Pack can handle 8000 VMs when using vRA 7.x, and 1000 VMs with vRA 6.2. That’s quite a lot less than the 50 000 VMs vRA can support. It would be nice to have all these VMs monitored, right? The NSX Management Pack also has a limitation of just 2000 NSX objects, but the release notes say this is the tested limit and it will work beyond 2000 VMs and 300 edges. This is probably true for the vRA Management Pack as well, but it is not stated in the docs.

Finally, vRealize Business for Cloud adds another limit to the mix. It can handle up to 20 000 VMs across 4 vCenters. Again, this limits the overall number of VMs in the system if all of them need to be monitored. Unfortunately there is no way to exclude some of the VMs in vRA; all of them are monitored by vRB. You can opt to leave some vCenters outside of vRB monitoring. Combining this limit with the others in this post, the total comes down to 20 000 VMs, and even lower if you want them monitored by vROps. There are ways to go beyond the limits by simply not monitoring all of the vCenters, or by adding more VMs than is supported and taking a risk. The last part is not recommended, of course.

As you can see, the limitations are all around us. You are golden up to 2000 VMs, but after that you really need to think about what you need to accomplish and do some serious sizing. Well, maybe a bit before that...

EHC 4.1.1

Component | VM Limitation | vCenter Limitation | Other | Source
--- | --- | --- | --- | ---
vCenter 6.0 U2 | 10 000 VMs (Powered On); 15 000 VMs (Registered); 8000 VMs per Cluster; 2048 Powered On VMs on single Datastore with HA | — | 64 ESXi hosts per Cluster; 500 ESXi hosts per DC; 1000 ESXi hosts | vSphere 6 Configuration Maximums
vRA 7.1 | 50 000 VMs; 75 000 VMs (vRA 7.3) | — | 1 vRO instance per tenant (XaaS limitation); EHC: 1 tenant allowed with EHC Services | vRealize Automation Reference Architecture
vRO 7.1 | 35 000 VMs; 15 000 VMs per vRO Node in Cluster Mode | 30 vCenters | Single SSO domain | vSphere 6 Configuration Maximums
NSX 6.2.6 | Follows vCenter limits | 1 vCenter per 1 NSX Manager | — | vSphere 6 Configuration Maximums
vROps 6.2.1 | 120 000 Objects (with fully loaded vROps, 16 Large nodes) | 50 vCenter Adapter instances | 50 Remote Collectors | VMware KB 2130551
Log Insight 3.6 | No VM limitation, only Events Per Second matter | 10 vCenters | 10 Forwarders, 1 AD, 2 DNS Servers, 1 vROps | Log Insight Administration Guide; Log Insight Calculator
vSAN 6.2 | 200 VMs per Host; 6400 VMs per Cluster; 6400 Powered On VMs | — | 64 Hosts per Cluster; 1 Datastore per Cluster; 1 Cluster per VxRail system | vSphere 6 Configuration Maximums; vSAN Configuration Limits
vRB for Cloud 7.1 | 20 000 VMs | 4 vCenters | — | vRealize Automation Administration Guide
Avamar 7.3 | No fixed limit; depends on data change rate, backup window and number of Proxies | 15 vCenters | Maximum of 48 Proxies; 8 concurrent backups per Proxy | Avamar 7.3 for VMware User Guide; EMC KB 411536
VPLEX / vMSC 5.5 SP1 P2 | 10 000 Powered On VMs; 15 000 Registered VMs | Follows vCenter limits | — | vSphere 6 Configuration Maximums
RecoverPoint 4.4 SP1 P1 / SRM 6.1.1 | 5000 VMs | 1 vCenter pair allowed in EHC | Can recover max 2000 VMs simultaneously | VMware KB 2105500
RecoverPoint for VMs 4.3 SP1 P4 | 1024 individually protected VMs; 2048 VMs per vCenter Pair; 4096 VMs across EHC | 2 vCenter Pairs in EHC | 32 ESXi hosts per cluster; recommended max 512 VMs per vSphere cluster with 4 vRPA clusters; if EHC Auto Pod is protected with RP4VM, 896 CGs left for Tenant workloads | RP4VM Scale and Performance Guide
vRA Mgmt Pack 2.2 | 8000 VMs (with vRA 7 / EHC 4.1.x) | — | Mgmt pack v2.0+ | vRA Mgmt Pack Release Notes
NSX Mgmt Pack 3.5 | 2000 VMs; 300 Edges (will scale beyond) | — | Mgmt pack v3.5+ | NSX Mgmt Pack Release Notes

Configure Log Insight Forwarder in Enterprise Hybrid Cloud

As part of our Enterprise Hybrid Cloud, we deploy a Log Insight instance to gather the logs from the various components of the solution. Back in the days of EHC 3.5 and older, we used to have a single Log Insight appliance or cluster, and all the syslog sources were pointed at it. Since EHC 4.0, that design has changed. Now we utilize a separate Log Insight Forwarder instance to collect and forward some of the logs. The reason behind this change is the ability of EHC 4.0 and newer to connect several remote sites (or vCenters) to one main instance of EHC. We want to collect logs from the remote sites as well, but it’s not efficient from a networking perspective to collect the logs straight from the components over the WAN to the main Log Insight cluster. Log Insight has a nifty built-in feature called Event Forwarding that can push the local logs to a central location. It’s designed to work over WAN, so it can optimize the network usage and also encrypt the traffic between sites. Pretty cool! There are plenty of other reasons to use forwarding as well.

[Image: LI_Architecture_v3]

Getting the Forwarder up and running is a simple process, but it’s not that well documented in the context of an existing Log Insight cluster. The information can be found in the VMware documentation, but it doesn’t really spell out the design. First things first: the Log Insight Forwarder is a separate installation of Log Insight. Unlike with vRealize Operations, you cannot deploy a “remote collector” instance of Log Insight and add it to the existing cluster. Instead, you have to do a full install of Log Insight. It can be a cluster as well, but since we use it simply to collect and push logs to a central location, a single-node installation is fine for our purposes. Follow the normal process of deploying the Log Insight OVA, configuring the network and launching the installation UI. Choose “New Deployment” and configure Log Insight just like you did for the main cluster.
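
If you prefer the command line for the OVA part, ovftool does the job. This is only a sketch: the OVA filename, VM name, datastore, network and vCenter inventory path below are placeholders for your own environment (%40 is just the URL-encoded @ in the SSO username):

ovftool --acceptAllEulas --powerOn \
  --name=li-forwarder01 \
  --datastore=mgmt_datastore01 \
  --network="AMP-Management" \
  VMware-vRealize-Log-Insight-3.6.0.ova \
  vi://administrator%40vsphere.local@vcenter.example.local/DC1/host/AMP-Cluster

After that, browse to the appliance and run the installation UI as described above.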

In order to get an encrypted connection (not mandatory) to work between the Forwarder and the main LI cluster, a trust needs to be established between the two installations. To make this happen, you need a custom CA-signed certificate on the main cluster, but that should already be in place for the cluster to work properly. Using self-signed certificates is not supported when it comes to the distributed components of EHC. For the connection to work, you need to add the root certificate chain of the main Log Insight cluster to the Forwarder keystore. See the official documentation for additional information.

  • Copy the trusted root certificate chain with scp or FileZilla into a temporary directory on the Forwarder instance. For example: /home
  • SSH to the Forwarder instance and run the following command:
     /usr/java/jre1.8.0_92/bin/keytool -import -alias loginsight -file /home/Root64.cer -keystore cacerts

    The default keystore password is changeit.
    Note: the Java version in the path may vary over time. A quick way to verify the import follows after this list.

  • Restart the vRealize Log Insight Forwarder instance
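
You can check that the certificate actually made it into the keystore by listing the alias back out; this assumes the same Java path and keystore as in the import command above:

/usr/java/jre1.8.0_92/bin/keytool -list -alias loginsight -keystore cacerts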

After the Forwarder instance is up and running, the final step is to add Event Forwarding between the Forwarder and the Cluster. Follow the docs for additional information. Navigate to the Administration interface of the Log Insight Forwarder and select Event Forwarding in the left pane. Choose New Destination, fill in the Log Insight Cluster FQDN, check the Use SSL box, make sure you are using the Ingestion API and press Test. You can leave the other options at their defaults. Click Save.

[Image: LIEventForwarding_test_success]

I came across a weird bug with the connection test and SSL. I had a clean Log Insight instance without anything logging to it. I configured all the steps above and hit Test. It came back with an error “Failed connection with {LI_FQDN}:9543”. Without SSL the connection test was OK. I double-checked everything and the certificates seemed fine. I then tested SSL by forcing a Log Insight Agent to connect over SSL to the appliance with the same root certificate chain. This was successful, so the error seemed quite odd. I came back and hit Test again, and it was successful! It seems that if the Log Insight appliance doesn’t have any logs to forward, the Test might fail. It’s also possible that this is a certificate-related issue, but I haven’t got to the bottom of it yet.

The last step is to configure the necessary agents and collect information from the local components. In the case of EHC, we divide the components according to the cluster where they are deployed. The Forwarder instance is located in the AMP or Core cluster, so we use that instance for all the AMP component log collection. This way we can deploy additional sites with the exact same Log Insight setup on all of them.
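
For reference, pointing an agent at the local Forwarder over SSL is just a few lines in the [server] section of liagent.ini; the hostname and CA path below are placeholders, and 9543 is the SSL port of the cfapi ingestion protocol:

[server]
hostname=li-forwarder01.example.local
proto=cfapi
port=9543
ssl=yes
ssl_ca_path=/etc/pki/tls/certs/li-root-chain.pem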

For EHC, here’s a list of components and the associated Log Insight instance:

Forwarder:

  • VMware vSphere/vCenter
  • VMware Site Recovery Manager
  • VMware ESXi Servers from all the clusters within the site
  • VMware NSX Manager
  • VMware NSX Edges
  • VMware NSX Controllers
  • VMware NSX Distributed Logical Routers
  • VMware vRealize Operations Manager Remote Collector
  • Dell EMC Storage
  • Dell EMC RecoverPoint
  • Dell EMC RecoverPoint for Virtual Machines
  • Dell EMC Avamar
  • Dell EMC SMI-S
  • Core Microsoft SQL Server
  • Core VMware Platform Services Controller 1 & 2
  • VMware vRealize Automation Agents
  • Microsoft Active Directory (if applicable)
  • Cisco UCS

Main Cluster:

  • VMware vRealize Automation (all the components except for Agents)
  • VMware vRealize Orchestrator
  • VMware vRealize Operations Manager (all components except for Remote Collectors)
  • VMware vRealize Business for Cloud
  • Automation Pod Microsoft SQL Server
  • Dell EMC Data Protection Advisor
  • Dell EMC ViPR
  • Automation Pod VMware Platform Services Controller

Done. Time for some serious log inspection!

vRA 7.0 Reinitiate Installation Wizard

EDIT:

Well, there’s actually a CLI command to do the steps below. Just run vcac-vami installation-wizard activate, and it does everything for you. Sounds like a clean approach to me.

[Image: vra7_re_enable_wizard_4]

/EDIT

vRA 7.0 comes with a nice Installation Wizard to ease the process of getting the vRA and IaaS components running. However, if you butterfinger the installation process by clicking Cancel and not really reading what vRA is trying to tell you (I did that), you cannot access the Installation Wizard again. It’s a manual installation after that, and I’m not going to do that anymore. So, let’s fix it.

[Image: vra7_re_enable_wizard_cancel]

Log into the vRA appliance using the SSH client of your choice and navigate to the /etc/vcac folder. There’s a nice little file called vami.ini. The only thing it contains is this setting:

[Image: vra7_re_enable_wizard]

Jackpot! Edit the file with vi, change false to true, save the file and restart the VAMI service:
service vami-lighttp restart
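
If you prefer a one-liner, something like this should do the same thing, assuming the file really contains only that single false value (as it did on my appliance):

sed -i 's/false/true/' /etc/vcac/vami.ini && service vami-lighttp restart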

Log back in to the VAMI at https://fqdn_of_vra:5480, and the Installation Wizard is reinitiated. If you need to close the Wizard and don’t want to go through this hassle again, click Logout in the upper right corner.

ESXi 5.5 U3 with new E1000 drivers for Intel NUC

I’ve been running a home lab for a while with a couple of Intel NUCs. They have been absolutely brilliant, but they do have a slight problem with network card drivers. ESXi 5.5 didn’t support the Intel 82579LM Ethernet Controller inside the NUCs, so you had to create a custom ISO image with the correct drivers. Today I wanted to upgrade my old ESXi 5.5 image to the latest one (I’m prepping to give vRA 7.0 a go). To my happy surprise, VMware has included the necessary E1000 drivers in the ESXi 5.5 U3 (and newer) package! Oh joy, no more custom images!

I can highly recommend NUCs for a home lab if the Wife Acceptance Level (WAL) on hardware needs to be high. Here’s an (old) picture of my earlier home lab with a single NUC and a ventilated cabinet. Two large fans circulate the air, powered from a USB port. No sound, no visuals -> WAL high!

[Image: homelab 001]

 

Add or Upgrade Plugins in vCenter Orchestrator Cluster Mode

Configuring vCenter / vRealize Orchestrator in cluster mode can be tricky. There are several sources of information on how to do it, including the official VMware documentation (vCenter Orchestrator 5.5.2 Documentation), so that part is not a big problem. Upgrading the plugins in cluster mode, however, can be challenging. There’s a procedure you have to follow if you don’t want to end up in a situation where plugins keep disappearing for no good reason.

The documentation from VMware really doesn’t cover this use case. In normal operation, you will probably have to install new plugins and upgrade old ones from time to time. If you’ve worked with a single-server install of vCO before, this was a simple process of uploading and installing the plugin. If you do the same with vCO in cluster mode, it will go horribly wrong. During a customer implementation, we came up with a procedure that works and keeps vCO operational. There is a short downtime, because the servers need to be rebooted and kept shut down for a small period of time. The whole thing takes about an hour to complete, but it can be done much faster if there’s only one plugin to install.

If you can tolerate the whole hour of downtime, I would suggest disabling the load balancer for the duration of the upgrade. If not, it can be partially up during the process, but obviously there’s a risk that workflows get suspended during the server reboots. We kept it up, but no one was allowed to use the portal.

The key is to keep only one of the vCO nodes active at a time. Basically you shut down vco2, upgrade/install on vco1, shut down vco1 and then upgrade/install on vco2. Follow this procedure to guarantee a successful install:

  1. Snapshot the vCO VMs
  2. Disable the load balancer leg for vCO node vco2
  3. Shutdown vCO node vco2
  4. Open vCO configuration page for vCO node vco1 and install the new/upgraded plugin (reboot if needed)
  5. Shutdown vCO node vco1
  6. Disable load balancer leg for vco1
  7. Restart vCO node vco2
  8. Enable load balancer leg for vco2
  9. Open vCO configuration page for vCO node vco2 and install the new/upgraded plugin (reboot if needed)
  10. Shutdown vCO node vco2
  11. Disable load balancer leg for vco2
  12. Restart vCO node vco1
  13. Enable load balancer leg for vco1
  14. Verify that the new plugin was installed correctly
  15. Restart vCO node vco2
  16. Enable load balancer leg for vco2
  17. Verify that the cluster has both nodes in RUNNING state (Server Availability tab)

If you already managed to destroy the cluster by trying to install a plugin while both of the nodes were up (it happens 😉 ), recovering is not too difficult. You might want to snapshot the VMs before doing anything. Here’s how we did it:

  1. Snapshot the vCO VMs
  2. Disable the load balancer for both vCO nodes
  3. Open the vCO configuration page for vCO node vco2, disable Cluster Mode and return to Single Server mode
  4. Shutdown vCO node vco2
  5. Open the vCO configuration page for vCO node vco1, disable Cluster Mode and return to Single Server mode
  6. Install/fix the necessary plugins on vco1
  7. Shutdown vCO node vco1
  8. Restart vCO node vco2
  9. Open vCO configuration page for vCO node vco2 and install/fix the necessary plugins
  10. Restart vCO node vco1
  11. Enable Cluster Mode on vco1 in Server Availability tab and specify 2 active nodes
  12. Export vCO config on vco1. Go to General tab and select Export
  13. Download the config file from vco1 appliance to your desktop
  14. Open the vCO configuration page for vCO node vco2, go to the General tab and import the config file. Before applying, deselect the check box; we don’t want to modify network settings!
  15. Go to the Server Availability tab and verify that both nodes are visible and RUNNING. We had to restart vco1 for this to happen

Bypass Traverse Checking in vRealize Automation 6.2

This week we ran into an interesting problem during a Federation Enterprise Hybrid Cloud implementation. We had the solution implemented with VMware vRealize Automation 6.2, and everything was running smoothly. The vRA implementation was done as a distributed install, so after configuration we moved on to some vRA component failover testing. We succeeded in failing over the primary component to the secondary component on all of the different VMs (vRA appliance, IaaS Web, IaaS Model Manager + IaaS DEM-O, IaaS DEM-Workers and IaaS DEM-Agents), but failback was not successful. After diving into the component logs, we found a distinctive error on almost all of them:

System.Configuration.ConfigurationErrorsException: Error creating the Web Proxy specified in the 'system.net/defaultProxy' configuration section

[Image: bypass_traverse_checking]
This error appeared on the IaaS Model Manager, DEM-O and DEM-Agents; the rest of the components failed back just fine. The symptom was that the VMware vCloud Automation Center Service and the DEM-Orchestrator Service would not start on reboot. We also could not restart them manually, because they would fail and the same error would appear in the logs. The error points to a .NET call that sets a default proxy according to the web.config file found on the Windows host (Windows\Microsoft.NET\Framework\v4.0.30319\Config). These files were not modified by us, so the error did not make a lot of sense. The web.config file also exists in some of the vRA folders, so the origin of this error was unclear. It was clear, however, that the vRA code was calling a .NET function during service start, and that call failed due to a proxy error. This led us on a wild goose chase with VMware support for a couple of days. It became clear that the security settings or the Windows image were blocking the services from starting. Since the issue only occurred after rebooting the Windows VMs, GPO seemed the prime suspect. After engaging the customer’s Windows/Security SME, we found the root of the problem.

Our customer runs a high-security environment, so their GPO settings are very strict. The vRA manual tells you to give these rights to the IaaS Service User:

"Log on as a batch job" and "Log on as a service"

We verified these settings, and everything was according to vRA requirements. However, the customer SME found out by using Process Explorer (https://technet.microsoft.com/en-gb/sysinternals/bb896653.aspx) that the Service User needs an extra local privilege called Bypass Traverse Checking. Process Explorer actually shows that the user needs a privilege called SeChangeNotifyPrivilege, but that privilege is what grants the user Bypass Traverse Checking. More info on that here: http://blogs.technet.com/b/markrussinovich/archive/2005/10/19/the-bypass-traverse-checking-or-is-it-the-change-notify-privilege.aspx. After giving the user the new rights, all of the services restarted!
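
A quick sanity check (not part of the original troubleshooting, just a handy way to verify) is to open a command prompt on the IaaS host as the service account and list its privileges. If the line below comes back empty, grant Bypass Traverse Checking under Local Policies > User Rights Assignment (locally or via GPO) and log the account off and on again:

whoami /priv | findstr /i SeChangeNotifyPrivilege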

OpenStack Juno Lab installation on top of VMware Workstation – Prep + Nova + Compute1

I had a busy fall changing jobs, so my OpenStack installation project was put aside. I joined EMC’s Enterprise Hybrid Cloud team as a Senior Solutions Architect to participate in the development of the product. Currently we have the Federation version of EHC GA’d (using EMC and VMware products to deliver a solid foundation for our customers to build their cloud on), but later on there will be an OpenStack version coming out as well. Because of that, OpenStack is even more relevant to me, although my time right now is committed to VMware products. I can’t go into details of the upcoming OpenStack version, but any hands-on knowledge is important. There will be a lot of automation (we are talking about a cloud, after all!), but that does not remove the need to know how to do things manually.

Since the summer, a new OpenStack release, Juno, has come out, so I decided to ditch Icehouse. OpenStack is being developed at the speed of light, so many of the installation issues with previous versions have been fixed. My previous post is still relevant for prepping the VMs if you decide to run an OpenStack lab deployment as VMs on ESXi. Follow that post to create a template which you can use for the different OpenStack components. Have a look at the Juno installation manual; there are fewer steps required for the base machine. Also decide at this point whether you are going with Neutron or nova-network (aka legacy networking). This will affect the network settings for your nodes.

The requirements for a minimal installation with CirrOS are quite low, so we can use a base machine with 2 GB of RAM for all the components (the networking node only needs 512 MB). I also noticed that the current installation manual for Juno takes into account running OpenStack inside VMs, like we are doing here. The need for promiscuous mode support and disabled MAC address filtering has been noted (hurray!). Note that you only need promiscuous mode enabled and MAC address filtering disabled for the external network! You can follow my previous post on how to do it on ESXi. Promiscuous mode is disabled by default there, so it needs to be changed, while MAC address forging detection and filtering are already disabled, so we can leave those be. For this build, I’m actually using VMware Workstation 9. How you enable promiscuous mode differs depending on whether your underlying OS is Linux or Windows. I’m running Windows 7, so all I need to do is enable promiscuous mode in the vmx files of my VMs. When using Workstation on Windows, promiscuous mode should be enabled by default. Just to make sure and to avoid issues later, let’s edit the vmx file and add this line:

ethernet0.noPromisc = "false"

This enables promiscuous mode for eth0. More vmx tweaking can be found here (http://sanbarrow.com/vmx/vmx-network-advanced.html). If you want to be exact, you should only do it for the NICs that are used for external networks. I had so many issues with this on Icehouse that I’m being paranoid and enabling it for all of my NICs. Since this is a lab environment, it doesn’t matter that much.
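
If you take the same paranoid route, the setting simply repeats per adapter; adjust the list to however many ethernetN devices your VM actually has:

ethernet0.noPromisc = "false"
ethernet1.noPromisc = "false"
ethernet2.noPromisc = "false"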

If you are running Linux, take a look at here:
https://pubs.vmware.com/workstation-9/index.jsp?topic=%2Fcom.vmware.ws.using.doc%2FGUID-089D2595-26C5-433B-9DA4-D2A94C63B7B5.html

After these steps you can continue installing the OpenStack components using the official installation manual for Juno on Ubuntu. I won’t go into every command, because the manual is quite good. There are a few notes, however, that I would like to share. First of all, OpenStack is using MariaDB nowadays. It won’t affect anything, but it was a nice surprise. PostgreSQL is also supported, by the way.

The manual notes that you can enable verbose mode for all of the components. As a learning experience, I strongly recommend that you do so. Something WILL go wrong, and chatty logs are good for that. On that note, one major issue that I had with the compute node was the choice of hypervisor. KVM requires hardware-assisted virtualization to work. We can enable this in our VM environment (https://communities.vmware.com/docs/DOC-8970), but that won’t save you. I had huge issues with KVM on Icehouse, and switching to QEMU helped a lot. Things might have progressed since, but for now I’m going with QEMU. After I get my setup working, I will definitely give KVM another go. If you try KVM, make this change to your vmx file:

vcpu.hotadd = "FALSE"
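
And if you stick with QEMU like I did, the switch on the compute node is a small config change. On Ubuntu with Juno it goes under the [libvirt] section (in /etc/nova/nova-compute.conf, or nova.conf depending on your packaging), and verbose logging is just a [DEFAULT] option in each component’s config; this is a sketch of those two settings, not a full file:

[DEFAULT]
verbose = True

[libvirt]
virt_type = qemu

Restart nova-compute afterwards (service nova-compute restart) for the change to take effect.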

That’s it, let’s start typing some commands!