Building VMware Linux VM Automation Tooling

We’ve just released an update that allows VMware birds to be pre-configured, so customers can deploy them trivially at scale. Our KB article explains how to make use of it, but this post goes deeper under the hood to explore how we made it happen.

Automatically Configuring Linux VMs

The de facto standard for configuring or customising cloud virtual machines (e.g. at AWS, GCP, or Azure) is cloud-init. Cloud-init works by reading configuration data from datasources outside the VM, and applying that configuration inside the VM. A datasource is something like an attached disk or CD, a magic URL, or a kernel interface through which cloud-init can pull configuration information. After discovering the available datasources, cloud-init fetches the configuration data from the identified datasource and applies it during various stages of the boot process.
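To make the datasource idea a little more concrete, here’s a heavily simplified Python sketch of the shape of a cloud-init datasource. This is illustrative only: “DataSourceExample” and its fetch helper are made-up names, and real datasources also handle caching, value encodings and network configuration.

  # Simplified sketch of a cloud-init datasource (illustrative, not a real one).
  from cloudinit import sources


  class DataSourceExample(sources.DataSource):
      dsname = "Example"  # the name used in cloud-init's datasource_list

      def _get_data(self):
          # Fetch configuration from "outside" the VM: an attached disk,
          # a magic URL, a hypervisor channel, and so on.
          raw = self._fetch_from_outside()  # hypothetical helper
          if raw is None:
              return False  # signal that this datasource isn't present
          self.metadata = {"instance-id": raw["instance-id"]}
          self.userdata_raw = raw.get("user-data", "")
          return True

      def _fetch_from_outside(self):
          # A real datasource would read from its channel here.
          return None


  # cloud-init discovers datasources through this (class, dependencies) list.
  datasources = [(DataSourceExample, (sources.DEP_FILESYSTEM,))]


  def get_datasource_list(depends):
      return sources.list_from_depends(depends, datasources)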

The Challenge

While we rely on cloud-init for some aspects of our configuration, we aim to support a wide range of VMware product versions, and reliable cloud-init support is not present in the older ones. The customised Linux distro we run also has its own peculiarities, and we want our customisations to run at specific points in the startup, which requires more than cloud-init can offer. These constraints meant our approach was bifurcated: for older VMware products we only support automatic network configuration, while for newer products the entire Canary can be configured (including its profile settings).

How VMware vSphere Customises a Linux Machine

Before cloud-init came onto the scene, VMware offered their own unique way to customise a machine when it first boots up. This method is called “Guest OS Customization” and requires that VMware Tools (or open-vm-tools) be installed in the VM. (In the open-vm-tools code, this is referred to as the “deployPkg” mechanism.)

Pulling back the curtain on the deployPkg magic reveals a bit of a Rube Goldberg machine. When you create a new VM from a template you’re presented with a dialog and, after selecting “Customize the operating system”, you are given the option to provide “User settings” to the VM.

(There’s an alternative approach on the “Customize guest OS” step, where you can pre-select an existing customisation policy or override one, which gives you the chance to specify DNS servers, the timezone, network settings, and even a custom script to run. Running a custom script is only possible if you explicitly enable it in the VMware Tools configuration.)

When your virtual machine boots up, VMware Tools discovers there’s a “package” available to deploy by checking whether the configuration parameter “tools.deployPkg.fileName” is set for the virtual machine. If it is, VMware Tools obtains a compressed package file using “magic” memory-mapped IO between the hypervisor and the virtual machine.

At this point VMware Tools extracts the package to a temporary location inside the VM. The package contains a few important files:

  • cust.cfg — The file containing the customisation information entered by the user in the launch wizard.
  • script.bat —  The customization script (if one was supplied in the wizard).
  • scripts/ — A collection of Perl (!) scripts to customise a variety of Linux distributions (e.g. Ubuntu, Debian, RedHat, etc).

VMware Tools starts the customisation process by running customize.sh inside the package’s scripts directory.  This will in turn detect what OS it’s being executed on, run the customisation script provided, and then configure the values specified in cust.cfg.

VMware Customization using cloud-init

When the VMware Tools and cloud-init versions are new enough, it’s better to rely on cloud-init to perform the VMware customization. This works differently from the deployPkg method above: three parameters can be configured on the hypervisor for the virtual machine, “guestinfo.metadata”, “guestinfo.userdata”, and “guestinfo.vendordata”. Cloud-init reads the values of these parameters and performs its magic during startup. However, this route wasn’t open to us.
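Although we couldn’t use it, here’s a rough feel for that route. The guestinfo datasource also accepts a matching “<key>.encoding” parameter (e.g. “base64”), so preparing values could look something like the sketch below, with the resulting key/value pairs then set on the VM via the vSphere API or the VM’s advanced settings. The metadata and userdata contents here are arbitrary examples.

  # Sketch: preparing base64-encoded guestinfo values for cloud-init.
  import base64

  metadata = "instance-id: example-vm-01\nlocal-hostname: example-vm-01\n"
  userdata = "#cloud-config\ntimezone: UTC\n"

  def b64(text):
      return base64.b64encode(text.encode()).decode()

  guestinfo = {
      "guestinfo.metadata": b64(metadata),
      "guestinfo.metadata.encoding": "base64",
      "guestinfo.userdata": b64(userdata),
      "guestinfo.userdata.encoding": "base64",
  }

  # Print the pairs to be applied to the VM's configuration parameters.
  for key, value in guestinfo.items():
      print("{} = {}".format(key, value))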

Hacking guestinfo

While it is possible for us to use the deployPkg method to configure the IP addresses of a Canary, that limits us to network configuration only. Using cloud-init’s VMware datasource with guestinfo is also not an option, due to the versions required to make it work.

While digging through how VMware manages to customise the operating system, we discovered that we can actually place any information we like in the parameters for the VM. Drawing some inspiration from the way cloud-init uses these to customise the VM, we added a couple of our own values.

So how do you access this information inside the VM? The VMware Tools package provides the “vmware-rpctool” utility. Using this program, you can read the parameters set up by the hypervisor, and even write values back or set your own. The commands are dead simple, and easy to wrap from code (see the sketch after the list):

  • Reading a parameter value: /usr/bin/vmware-rpctool "info-get <key>"
  • Setting a parameter value: /usr/bin/vmware-rpctool "info-set <key> <value>"
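A minimal sketch of such a wrapper in Python. The helper names and error handling here are illustrative, not the real module’s code:

  # Minimal sketch: wrapping the two vmware-rpctool commands above.
  import subprocess

  RPCTOOL = "/usr/bin/vmware-rpctool"

  def info_get(key):
      """Read a guestinfo parameter; returns None if it isn't set."""
      try:
          result = subprocess.run(
              [RPCTOOL, "info-get {}".format(key)],
              capture_output=True, text=True, check=True,
          )
          return result.stdout.strip()
      except subprocess.CalledProcessError:
          return None

  def info_set(key, value):
      """Write a parameter back to the hypervisor."""
      subprocess.run([RPCTOOL, "info-set {} {}".format(key, value)], check=True)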

Armed with a way to reliably pass any information we need between the Hypervisor and the Canary, we modified our custom Canary cloud-init module and added a few new features.

  • guestinfo.network — Base64-encoded JSON object that configures the network settings of the Canary
  • guestinfo.initial_profile — String value specifying one of our pre-configured personalities for the Canary
  • guestinfo.initial_settings — Base64-encoded JSON object that configures any of the services on the Canary
  • guestinfo.autocommission_token — Sets the auto-commission token, used to automatically enrol the Canary with the console
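To give a feel for how these are consumed inside the VM, here’s a rough sketch of reading and decoding two of the keys, reusing the hypothetical info_get() helper from earlier; the real module does considerably more validation:

  # Sketch: reading and decoding guestinfo keys inside the Canary VM.
  import base64
  import json

  def read_network_config():
      raw = info_get("guestinfo.network")
      if raw is None:
          return None  # parameter not set: nothing to configure
      return json.loads(base64.b64decode(raw))

  def read_initial_profile():
      return info_get("guestinfo.initial_profile")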

The downside to automatic configuration is that there’s no easy way to detect if it has failed for some reason. For example, if the user specified a profile name incorrectly, all you’d see is that the Canary does not start up as expected. In our initial use of “vmware-rpctool” we only read the values and processed them; we then had the idea to write errors back into another parameter. That is where setting values came in handy: if a configuration value is not correct or something goes wrong, we write the error back to “<key>.error” (e.g. “guestinfo.network.error”), giving the user a clear indication of what went wrong while preserving the original data.
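In code, the write-back is simply the setter from earlier pointed at a derived key name. A rough sketch, again assuming the hypothetical info_get()/info_set() helpers above:

  # Sketch: on failure, record the problem in "<key>.error" for the operator.
  def apply_guestinfo_key(key, apply_fn):
      raw = info_get(key)
      if raw is None:
          return  # parameter not set, nothing to do
      try:
          apply_fn(raw)
      except Exception as exc:  # illustrative; real handling is narrower
          info_set("{}.error".format(key), str(exc))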

End Result

Armed with these new parameters and tools to automatically deploy VMware machines, a customer can deploy hundreds of them in a matter of minutes.
