I’m starting from where I left off in my last article here. I have a single OpenStack test node running in a virtual environment hiding behind a firewall through which SSH is the only access.
All of the following was done with DevStack commit 984c3ae33b6a55f04a2d64ea9ffbe47e37e89953, which is roughly OpenStack Stein (3.18.0). Note, this version was in development at the time of writing.
Test system details:
I continue to search for, or build, a set of tools that can quickly set up and tear down VMs for testing purposes. Testing via VMs requires getting a base test image started, scripted configuration, kicking off the test sequence, and pulling the results. Manual steps are a no-go. Speed is preferable, but it needs to work first.
Image Exploration
Ubuntu Cloud Image – Ubuntu Server the Easy Way
Conveniently, Canonical provides a cloud-ready set of Ubuntu images here: http://cloud-images.ubuntu.com/releases/
I haven’t found a smaller version of Ubuntu anywhere else.
The VMDK image, https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.vmdk, works. The image has a virtual size of 10GB.
The first attempt to start a server with the image fails on my test system, with a timeout in the downloading state after about 3 minutes. The volume eventually gets created. If you remove the instance and volume and try again, the second attempt works in less than 10 seconds. I suspect the originally loaded image gets cached for reuse.
Note that this timeout happens for every image that gets loaded: the first attempt times out, and the second attempt completes quickly, provided the first volume finished being constructed.
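Since the failure is predictable, it can be scripted around: create, wait, and if the first attempt times out, clean up and go again. Here's a rough sketch of how I'd do that with novaclient; the helper name, timeout value, and retry count are my own choices, not anything OpenStack provides:

import time
import novaclient.client

nova = novaclient.client.Client( <server-info-and-login-credentials> )

def create_server_with_retry(name, image_id, flavor_id, attempts=2, timeout=180):
    # First attempt typically times out while the image is pulled into the
    # volume cache; the retry then completes in seconds.
    for _ in range(attempts):
        server = nova.servers.create(name, image_id, flavor_id)
        deadline = time.time() + timeout
        while time.time() < deadline:
            server = nova.servers.get(server.id)
            if server.status == 'ACTIVE':
                return server
            if server.status == 'ERROR':
                break
            time.sleep(5)
        # Clean up before retrying. If the instance left a boot volume
        # behind (no delete_on_termination), that needs removing too.
        nova.servers.delete(server)
        time.sleep(10)
    raise RuntimeError("server %s never became ACTIVE" % name)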
Create Your Own Ubuntu Server Image
In some cases, the Ubuntu cloud image doesn’t quite do what you need. In that case it’s nice to be able to create your own images. This describes how to do that from the ground up.
There is a writeup of how to create an Ubuntu image suitable for OpenStack here: https://docs.openstack.org/image-guide/ubuntu-image.html
I think it’s noteworthy that they don’t recommend the OpenStack interface for doing this. I opted to do this through virt-manager and KVM.
The virt-sysprep step yielded an error for me due to a missing /var/lib/urandom/random-seed file. Running the following commands on the created volume clears that up:
mkdir /var/lib/urandom
touch /var/lib/urandom/random-seed
Another step missing from the above instructions is how to prepare an image to use cloud-init when cloud-init is already active. The important missing steps are:
- If the volume has already been booted with cloud-init, either:
  - Run dpkg-reconfigure cloud-init and remove the NoCloud data source. You may also want to remove all other data sources that you know you won’t need.
  - OR: delete the /var/lib/cloud/seed directory so that local configuration data isn’t found. The seed data might be a useful reference, so I use dpkg-reconfigure.
- Run cloud-init clean
- Adding the users in the /etc/cloud/cloud.cfg 'users' section doesn’t change the default user for which passwords and SSH keys get set. Do this instead by changing:
system_info:
  default_user:
    name: …
Note that once all of the above steps are done, the image will not boot again without a configuration service. That makes it a little awkward if you want to adjust the image configuration or upgrade packages. So I did this in two stages:
- Install the OS and relevant applications.
- Copy the image and do the cloud-init prep.
That way I can always easily get back into the original image.
Following the above OpenStack Ubuntu image guide results in an image with a virtual size of 6GB, which is the smallest disk size accepted by the Ubuntu 18.04 Server installer.
Use:
qemu-img convert -O qcow2 <input-volume> <output-volume>
to bring the image size down to less than 3GB.
Note that it is possible to boot from OpenStack volume snapshots, so for the purposes of creating a server they’re indistinguishable from images. A custom configuration of an Ubuntu cloud image (turned into a volume) can be snapshotted to function similarly to the above.
Create Your Own Server Image – Naive Way
This was the first way I attempted to create a usable Ubuntu Server image, but I mention it last because it was awkward and didn’t work as well for updating the image. However, if all you have is OpenStack, this will work fine.
Creating a new image from an Ubuntu installation ISO is straightforward from the OpenStack Horizon UI. Go to Compute -> Images, click Create Image, and follow the instructions. This has to be done with an admin account to make it a public image.
After that, create a server instance using the new ISO image, create a volume large enough for the installed base (>6GB for Ubuntu 18.04 Server), attach that new volume to the installer instance, and reboot it if necessary to make the volume appear on the instance. Then go through the installation process.
Detach the volume with the installed system, destroy the installation server, and create a server that boots directly from the Ubuntu 18.04 Server volume. This will allow us to do some direct configuration of the volume before turning it into an OpenStack image. cloud-init will have been installed and initially configured by the Ubuntu 18.04 Server installer. Follow the steps in the previous section to prep cloud-init to work with OpenStack on this new image.
Creating an image from a volume is fairly straightforward. There is an option in the OpenStack UI, Upload to Image. Make sure to specify 6GB as the minimum disk size.
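The same operation is available through the cinderclient API, which is handy for scripting. A sketch, assuming an existing cinderclient connection (the volume ID and image name here are illustrative):

import cinderclient.client

cinder = cinderclient.client.Client( <server-info-and-login-credentials> )

# upload_to_image(volume, force, image_name, container_format, disk_format).
# force=True permits the upload even if the volume is attached.
volume = cinder.volumes.get("<volume-id>")
cinder.volumes.upload_to_image(volume, True, "ubuntu-18.04-server", "bare", "qcow2")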
Note: If you don’t have enough storage to make any of the above work, OpenStack doesn’t give you much, if any, warning. Neither the UI nor the cinder upload-to-image command gives any notice about running out of space. The operation just silently fails. A little more on that below.
You might hold onto the original Ubuntu 18.04 Server installation volume because iteratively creating volumes from images and images from volumes seems to cause the sizes to grow. By keeping the original volume, you can go back to the original installation to apply package updates.
Create a Desktop Image
For testing of desktop applications, it’s helpful to have Ubuntu Desktop automatically log into a user account and have an autostart script kick off the testing job.
Auto-login via the UI settings is iffy when booting via OpenStack with server configuration enabled. Sometimes it goes through to the user desktop; most of the time it just stops at the login prompt. You can verify that auto-login is enabled, and which account is set to log in, by looking at /etc/gdm3/custom.conf. Looking through the boot logs, I find in auth.log that gdm-login is failing due to: no password is available for user.
I chalked this up to a race condition between OpenStack server configuration and the desktop boot sequence. This answer suggests an alternative way to auto-login: https://askubuntu.com/questions/1136457/how-to-restart-gdm3-with-autologin-on-18-04-and-19-04
After many testing iterations, the suggestion looks solid.
The following steps will set up an Ubuntu Desktop cloud image (note, done on KVM through virt-manager):
- Create a 9 GB volume (the smallest allowed by the desktop installer).
- Install Ubuntu Desktop 18.04 on the volume.
- Boot into the new desktop volume.
- apt install cloud-init
- Update /etc/cloud/cloud.cfg to the user set up during installation:
system_info:
  default_user:
    name: …
- vi /etc/gdm3/custom.conf and add these lines in the [daemon] section:
TimedLoginEnable = true
TimedLogin = <user>
TimedLoginDelay = 1
- Shut down the VM.
- Use the qemu-img command mentioned earlier to shrink the volume.
This creates an image with a virtual size of 9GB and a real size between 6 and 7 GB. With cloud-init, all of the boot configuration gets done. Note that the user to be auto-logged-in must have a password set; otherwise the auto-login feature will fail.
On my test OpenStack environment it takes 10-30 min to boot this image to the desktop, with the volume cached. Directly on KVM (the system running OpenStack), it takes less than 2 min to boot to the desktop. I’ll assume for now that the multi-level virtualization isn’t helping performance and revisit it when I’m working directly on metal. This might also have been causing the problems with auto-login, but I’ll leave that alone since I prefer a system that works both virtualized and raw.
On-Demand Volume Creation/Mounting
I need a quick way to wrap up local data and make it available to the remote OpenStack servers.
For example: In a KVM setup, I can rsync local data to a remote directory on the KVM server and mount that directory read-only to the VM instances. OpenStack (or at least DevStack) doesn’t appear to have that capability by default, preferring to work in terms of images (the glance service) and volumes (the cinder service).
The OpenStack Horizon interface doesn’t allow mounting a volume read-only (or otherwise) to multiple servers. While I’ve seen some mention of multi-mounting (and possible complications), I’m going to assume that this isn’t the OpenStack design and plan to make a volume per server that requires it.
OpenStack volumes aren’t readily cloned; a snapshot or image must be made first. So going this route with a volume would require creating the initial volume, snapshotting it, and then creating a new volume from the snapshot on each server boot. Images can be created directly via the API, while snapshots appear to need an existing volume for reference. That makes the image workflow a little simpler, so I’ll start with that.
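For comparison, the volume route would look something like this through cinderclient (a sketch; the names and the 1GB size are mine):

import cinderclient.client

cinder = cinderclient.client.Client( <server-info-and-login-credentials> )

# One-time: snapshot the prepared data volume.
snapshot = cinder.volume_snapshots.create("<data-volume-id>", name="test-data-snap")

# Per server boot: materialize a fresh volume from the snapshot (size in GB).
new_volume = cinder.volumes.create(1, snapshot_id=snapshot.id, name="test-data")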
Since we’ll be dealing in volumes and not synchronizing directories, we will need a way to create volumes on the local dev system, preferably without requiring root.
The GuestFish set of tools can be used to create a volume in user mode on Linux. However, the running kernel will have to be readable by the user of the tools. A script added to /etc/kernel/postinst.d might take care of this. Access to /dev/kvm is also preferred.
This demonstrates how to make GuestFish work: http://libguestfs.org/guestfs-python.3.html. Code to make a volume from a directory, .tar, or .zip file may be forthcoming.
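In the meantime, here's a minimal sketch of building a qcow2 EXT4 volume from a local directory with the libguestfs Python bindings (paths and the 1GB size are illustrative):

import guestfs

g = guestfs.GuestFS(python_return_dict=True)

# Create a 1GB qcow2 disk and attach it to the guestfs appliance.
g.disk_create("data.qcow2", "qcow2", 1024 * 1024 * 1024)
g.add_drive_opts("data.qcow2", format="qcow2")
g.launch()

# Partition, create the filesystem, and copy the local directory in.
g.part_disk("/dev/sda", "mbr")
g.mkfs("ext4", "/dev/sda1")
g.mount("/dev/sda1", "/")
g.copy_in("local-data-dir", "/")  # or g.tar_in("data.tar", "/") for a tarball

g.shutdown()
g.close()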
On a relatively light ultrabook without access to /dev/kvm, creating a qcow2-formatted EXT4 volume of any size with GuestFish takes at least 15 seconds.
Pushing the volume data as a new image works well. Once there, volumes can be created from the images. This seems like a slight twisting of the intent of OpenStack images, but I like that it makes clear that this data SHOULD NOT CHANGE. Also, volumes work in terms of GB, whereas images appear to be able to be any size. That made me shy away from volumes, since the purpose of this function is to push data on-demand.
Python API calls to create the image:
import glanceclient.client

volume_name = "Chosen Volume Name"
volume_file = "qcow-formatted-volume.qcow2"
glance = glanceclient.client.Client( <server-info-and-login-credentials> )

# Register the image record, then upload the qcow2 data into it.
new_image = glance.images.create(name=volume_name, container_format="bare", disk_format="qcow2", min_disk=1)
with open(volume_file, "rb") as f:  # binary mode; image data isn't text
    glance.images.upload(new_image.id, f)
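From there, creating a volume backed by the uploaded image is one more call, this time through cinderclient (a sketch; the size is in GB):

import cinderclient.client

cinder = cinderclient.client.Client( <server-info-and-login-credentials> )

# Create a 1GB volume from the image uploaded above.
data_volume = cinder.volumes.create(1, imageRef=new_image.id, name=volume_name)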
At this point, I’m satisfied that I can push data to OpenStack from a dev system.
An alternative, possibly lighter technique using shared filesystems, called Manila, can be found here: https://docs.openstack.org/manila/pike/install/post-install.html
Investigation of this will have to wait for another day.
Incremental Image/Volume Update
As noted earlier, rsync to a directory on the VM host makes for a quick way to do incremental updates to data that is to be mounted to VMs. It doesn’t look like OpenStack supports anything like this with images or volumes by default, requiring instead a complete re-upload of the data.
If Manila for OpenStack works well, that might be an option.
Automated Server Creation
Arbitrary volume configurations can be passed to the server creation API: (Python) novaclient.client.Client(…).servers.create( …, block_device_mapping_v2=… ). Note that you must pass None as the image for this to take effect. I didn’t find any direct documentation on the format of the parameter, but the API code reveals a structure of at least:
# Note that the 'volume_size' parameter takes no effect if
# 'destination_type' is 'local'.
block_device_mapping_v2 = [
{
'uuid': image_id,
'source_type': 'image',
'destination_type': 'volume',
'volume_size': required_disk_size,
'boot_index': 0,
'delete_on_termination': True
},
{
'uuid': other_image_id,
'source_type': 'image',
'destination_type': 'volume',
'volume_size': other_image_min_disk,
'boot_index': 1,
'delete_on_termination': True
}
]
source_type and destination_type are described pretty well here: https://docs.openstack.org/nova/latest/user/block-device-mapping.html
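Putting it together, the creation call looks something like this (the server name and flavor are placeholders):

import novaclient.client

nova = novaclient.client.Client( <server-info-and-login-credentials> )

# image=None is required; the boot source comes from the mapping itself.
server = nova.servers.create("test-server", None, "<flavor-id>",
                             block_device_mapping_v2=block_device_mapping_v2)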
Script-based Server Removal
When creating servers via Horizon, you can set a switch to remove volumes on server destruction, which does exactly what it suggests.
Removing a server via Horizon where the volumes have been set to delete_on_termination also removes the volumes.
This is also confirmed to work when removing servers via the API: (Python) novaclient.client.Client(…).servers.delete( server )
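For scripted teardown, I want to wait until the server is actually gone before moving on (the polling loop here is my own addition, not part of the API):

import time
import novaclient.client
from novaclient import exceptions

nova = novaclient.client.Client( <server-info-and-login-credentials> )

nova.servers.delete(server)
while True:
    try:
        nova.servers.get(server.id)
    except exceptions.NotFound:
        break  # gone, along with any delete_on_termination volumes
    time.sleep(2)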
Problems in my OpenStack Test Environment
Loss of Service Interconnectivity
If the test VM IP changes for some reason, all OpenStack services are lost. Service information is stored as IPs throughout the config files. Changing the config files and restarting the node doesn’t seem to correct the problem; in fact, the config files in /etc don’t seem to have any impact at all. This problem is possible in my DHCP-based test setup, and it hit. A fixed IP is a must.
Server Creation Timeouts
First-time creation of a server from any sufficiently large image times out (3 min), requiring the server to be destroyed and re-created. Since servers might be created with scripts, some volume cleanup may also be necessary. The sluggishness of volume creation could just be a problem with my system, but it seems like OpenStack could manage its timeouts better. Volumes that have been cached are created within seconds.
New Problems With OpenStack
Storage Leak
It appears that the OpenStack tools can fall out of sync with the volume backing. After several rounds of creating and destroying volumes, I find that I can’t create new volumes. Investigating the messages associated with the problem shows it’s usually due to a lack of space. However, I can delete everything via the OpenStack UI and the problem still doesn’t get corrected. From what I can tell, there is still space available on the host.
Running sudo vgs revealed that I had virtually no free space in the logical volume groups. Running sudo lvscan showed all of the volumes taking up my volume space. If I remove one manually with sudo lvremove, I can create volumes again for a little while. After a few rounds of this, I still end up unable to create volumes, and I just end up rebuilding the test node.
As a test, I created an OpenStack volume, observed it in the lvscan list, removed it, and saw that it got removed from the volume group. But some volumes get left behind; I suspect those volumes are part of a cache. I would expect a cache to perform better than this in low-resource circumstances.
Disappearing Services
The stack test node hit what looked like low-memory conditions and started swapping. I let the system go for a while and returned to remove the active servers and volumes. It initially appeared to be functioning again.
However, I didn’t realize that the configuration service was either gone or malfunctioning, so no new servers were going to get any bootup configuration data. This was particularly unfortunate because I was experimenting with getting a custom image configured.
Out-of-Space Errors
The messages presented via Horizon regarding why an operation failed are not the most helpful. Most of the time you are at least told there is an error with an operation. The messages tend not to point out the cause, but at least you have a point from which to dig. In some cases, such as uploading an image from a volume, the operation will just silently fail and then you’re left digging through logs to try to find out what happened.
Granted, a production environment would be independently monitoring available system storage, but a little more robustness around error reporting would be helpful.