Tag Archives: openstack

OpenStack Single-Node Options

After getting a DevStack node running here, I realized that a DevStack cluster wasn’t going to be as useful as I would like. DevStack is designed for developer testing purposes, and doesn’t recover well in the event of a machine reboot.

I decided to look at some options that would lead to a cluster and, ultimately, a high-availability (HA) configuration.

Canonical conjure-up

A colleague had recommended Canonical’s conjure-up, so I decided to give the Workstation guide a try:

https://ubuntu.com/openstack/install#workstation-deployment

Initially, I hit one of three problems:

  1. neutron-gateway/0 error: hook failed: "config-changed"
    Specifically, neutron-gateway/0 didn’t like the dataport br-ex:eno2, claiming that eno2 didn’t exist. Confirmed. I looked for ways that eno2 might be expected to be generated without much luck.

    This github issue looks like it describes the problem almost perfectly (except for the name of the network interface), but the fix/correction makes no sense to me. It also looks like it was fixed in 2016, so I don’t think this is my problem.

    I took a chance on tweaking the configuration, trying other values for the dataport parameter. Specifying eth1 could allow neutron-gateway/0 to complete installation, but then I typically ran into another problem with neutron-api/0. I didn’t get far in investigating that one. If I’m running into these kinds of problems following what should be a straightforward install guide, then I have no idea how deep the issues go.
  2. Machine locked up and became completely unresponsive. This was only seen on VMs (mostly 18.04), but I didn’t try much on bare metal.
  3. System setup hang: all processes were either active, blocked, or waiting — many just waiting on a machine. This was more rare.

There were regularly blocked-service errors in the conjure-up logs, leading to some confusion about the actual source of the installation error(s).

This is not a good start for Canonical. I tried their guide on 16.04 and 18.04, VMs and hardware — all with no success.

I’ve filed an issue with conjure-up: https://github.com/conjure-up/conjure-up/issues/1612

I asked a question: https://askubuntu.com/questions/1157568/how-to-install-openstack-with-novalxd-on-ubuntu-16-04

A week after filing the conjure-up issue, I’ve not heard back, and my question on askubuntu.com has gotten little attention. Other issues have been filed with exactly the same problems. This one is the most active: https://github.com/conjure-up/conjure-up/issues/1618

It seems odd to me that such simple instructions posted by a major OS vendor would be allowed to get to a state where they fail so completely.

Bottom line, I think the conjure-up OpenStack Workstation install is busted. There was a bit of irony in this struggle due to all of the “works by magic” claims with conjure-up and juju. Their magic lacks some potency.

The only good thing to come out of this was that I figured out how to interact with juju — a service that looks a bit like a Puppet/Chef/Ansible installer/manager. It looks like it has support for all kinds of different cloud types and vendors, but if those integrations work like this busted localhost install, I’m not sure that I’ll get much out of them.

There is still the conjure-up cluster installation (which was the original recommendation to me). More on that coming soon.

PackStack

I mentioned that this was an option in the first article, and I was fed-up with conjure-up, so I decided to look at OpenStack on CentOS.

I’m using CentOS 7 on a 6 core, 24 GB RAM, 128 GB disk VM.

I found this guide: https://linuxhint.com/install_openstack_centos/
The only adjustment needed was to run:

yum install -y yum-utils

so that the yum-config-manager step will work correctly.

I have tried this with both OpenStack versions rocky and stein, but not queens as indicated in the instructions. Just replace rocky or stein wherever queens is mentioned.

Success!

That’s how it should be done. Installation takes around 24 minutes, which is faster than the DevStack install.

The system survives reboots out of the box, which might make it a better candidate for more involved tests than DevStack. It would certainly behave better as a test system on hardware.

Unfortunately, on closer inspection PackStack starts to look a little weaker when it comes to clustering and HA configurations.

Mentions

Throughout my investigation some other OpenStack options have popped up:

  • Mirantis – This is a professional OpenStack (and then some) system. It looks like these guys know what they’re doing, and they charge accordingly. There might have been a free evaluation option, but I prefer to try things that are unencumbered by licenses.
  • MicroStack – This name popped up in the AskUbuntu question. I don’t know much about it except that it doesn’t sound like it supports a high-availability configuration, though they say that is slated for 2020/2021.

Scripting

A quick test with the scripts I had written against the DevStack install revealed that some things were hard-coded to that installation. The big adjustment I had to make was in the lookup of OpenStack service endpoints. Given the OpenStack architecture, it makes sense that communication points could differ across clusters. I wrote a bit about that here.

Next Steps

Several months into this project, I have a couple of ways to set up a demonstration installation of OpenStack on a single node, Python scripts to interact with it in some basic ways, and a technique for delivering data for project testing purposes. However, I still don’t have a system that takes advantage of the primary OpenStack capability: compute pooling across multiple hardware nodes. More on that to come.

OpenStack Python API Service Catalog

How do you get an OpenStack service catalog via the Python API? It took me way too long with too much code digging to figure it out, so I’ll share.

Note that the following was tested on versions rocky and stein.

The way that works for anyone that can log into the identity node:

# auth_url, username, password, and project_name are supplied by the caller.
from keystoneauth1 import identity, session

auth = identity.Password(
    auth_url,
    username=username,
    password=password,
    user_domain_id="default",
    project_domain_id="default",
    project_name=project_name )
sess = session.Session( auth=auth )

# This step was not obvious. It doesn't seem 
# to be directly doable from the Session, 
# which seems like the more obvious approach.
auth_ref = sess.auth.get_auth_ref(sess)
catalog = auth_ref.service_catalog.get_endpoints( interface="public" )
service_endpoints = {}
for s_name in catalog:
    s = catalog[s_name]
    service_endpoints[s_name] = s[0]["url"]

# Here's a dictionary of service-type -> endpoint URL
service_endpoints

This was actually the second technique discovered. The technique below was the first, but it only works for users that have admin privileges to a project and (I think) reader privileges to the project services.

# auth_url, username, password, and project_name are supplied by the caller.
from keystoneauth1 import identity, session
import keystoneclient

auth = identity.Password(
    auth_url,
    username=username,
    password=password,
    user_domain_id="default",
    project_domain_id="default",
    project_name=project_name )
sess = session.Session( auth=auth )

keyclient = keystoneclient.client.Client(
    "3.0",
    session=sess )

service_endpoints = {}
service_list = keyclient.services.list()
for s in service_list:
    endpoints = keyclient.endpoints.list( enabled=True, service=s.id, interface="public" )
    service_endpoints[s.type] = endpoints[0].url

# Here's a dictionary of service-type -> endpoint URL
service_endpoints

Note that the URLs from this technique will not include project IDs. Instead they contain a placeholder to be substituted, such as %(tenant_id)s.
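As a minimal sketch of that substitution, the helper below fills in the project ID. It assumes the placeholder takes the %(tenant_id)s or $(tenant_id)s form (or the project_id equivalent); the function name expand_endpoint_url is hypothetical and not part of any OpenStack library. The project ID itself can likely be read from the session with sess.get_project_id().

```python
import re

# Hypothetical helper, not part of keystoneclient: matches the
# placeholder forms %(tenant_id)s, $(tenant_id)s, %(project_id)s
# and $(project_id)s that can appear in endpoint URL templates.
_PLACEHOLDER = re.compile(r"[%$]\((?:tenant_id|project_id)\)s")


def expand_endpoint_url(url, project_id):
    """Return url with any project-ID placeholder replaced."""
    # A function is used as the replacement so that project_id is
    # inserted literally, with no backslash interpretation.
    return _PLACEHOLDER.sub(lambda match: project_id, url)


if __name__ == "__main__":
    template = "http://10.0.0.5:8776/v3/%(tenant_id)s"
    print(expand_endpoint_url(template, "abc123"))
```

URLs without a placeholder pass through unchanged, so the helper can be applied to every entry in service_endpoints.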

I have opened a question on Ask OpenStack that might eventually result in a better option: https://ask.openstack.org/en/question/123404/how-to-get-service-catalog-with-python-api/#123445

OpenStack Images and Volumes

I’m starting from where I left off in my last article here. I have a single OpenStack test node running in a virtual environment hiding behind a firewall through which SSH is the only access.

All of the following was done with DevStack commit 984c3ae33b6a55f04a2d64ea9ffbe47e37e89953, which is roughly OpenStack Stein (3.18.0). Note, this version was in development at the time of writing.

Test system details:

  • 2 core
  • 4 GB RAM
  • 80 GB disk

I continue to search for or build a set of tools that can quickly set up and tear down VMs for testing purposes. Testing via VMs requires getting a base test image started, scripted configuration, kicking off the test sequence, and pulling the results. Manual steps are a no-go. Speed is preferable, but it needs to work first.

Image Exploration

Ubuntu Cloud Image – Ubuntu Server the Easy Way

Conveniently, Canonical provides a cloud-ready set of Ubuntu images here: http://cloud-images.ubuntu.com/releases/

I haven’t found a smaller version of Ubuntu anywhere else.

The VMDK image, https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.vmdk, works. The image has a virtual size of 10GB.

The first attempt to start a server with the image fails on my test system — timeout in state downloading after ~3 min. The volume eventually gets created. If you remove the instance and volume and try again, the second attempt works in less than 10 seconds. I suspect the original loaded image gets cached for reuse.

Note that the above timeout issue happens for every image that gets loaded: the first attempt times out, and the second attempt completes quickly if the first volume was completely constructed.

Create Your Own Ubuntu Server Image

In some cases, the Ubuntu cloud image doesn’t quite do what you need. In that case it’s nice to be able to create your own images. This describes how to do that from the ground up.

There is a writeup of how to create an Ubuntu image suitable for OpenStack here: https://docs.openstack.org/image-guide/ubuntu-image.html

I think it’s noteworthy that they don’t recommend the OpenStack interface for doing this. I opted to do this through virt-manager and KVM.

The virt-sysprep step yielded an error for me due to a missing /var/lib/urandom/random-seed file. Running the following commands on the created volume clears that up:

mkdir /var/lib/urandom
touch /var/lib/urandom/random-seed

Another step missing from the above instructions is how to prepare an image to use cloud-init when cloud-init is already active. The important missing steps are:

  • If the volume has already been booted with cloud-init:
    • Either: run dpkg-reconfigure cloud-init and remove the NoCloud data source. You may also want to remove all other data sources that you know you won’t need.
    • OR: delete the /var/lib/cloud/seed directory so that local configuration data isn’t found. The seed data might be a useful reference, so I use dpkg-reconfigure.
    • Run cloud-init clean
  • Adding users to the 'users' section of /etc/cloud/cloud.cfg doesn’t change the default user for which passwords and SSH keys get set. Change that user via system_info: default_user: name: …

Note that once all of the above steps are done, the image will not boot again without a configuration service. That makes it a little awkward if you want to adjust the image configuration or upgrade packages. So I did this in two stages:

  1. Install the OS and relevant applications.
  2. Copy image and do the cloud-init prep.

That way I can always easily get back into the original image.

Following the above OpenStack Ubuntu image guide results in an image with a virtual size of 6GB, which is the smallest disk size accepted by the Ubuntu 18.04 Server installer.

Use:

qemu-img convert -O qcow2 <input-volume> <output-volume>

to bring the image size down to less than 3GB.
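For scripted use, the conversion can be wrapped in Python. This is only a sketch: the function names are hypothetical, it assumes qemu-img is on the PATH, and it relies on the virtual-size and actual-size fields that qemu-img info prints in JSON mode.

```python
import json
import subprocess


def convert_command(src, dst, fmt="qcow2"):
    """Build the qemu-img command line shown above."""
    return ["qemu-img", "convert", "-O", fmt, src, dst]


def image_sizes(path):
    """Return (virtual_size, actual_size) in bytes for an image file."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True, capture_output=True, text=True)
    info = json.loads(result.stdout)
    return info["virtual-size"], info["actual-size"]


def shrink(src, dst):
    """Convert src to a compact qcow2 at dst and report its sizes."""
    subprocess.run(convert_command(src, dst), check=True)
    return image_sizes(dst)
```

Calling shrink("ubuntu-server.img", "ubuntu-server.qcow2") (file names are placeholders) returns both sizes, which makes it easy to confirm that the actual size dropped while the virtual size stayed the same.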

Note that it is possible to boot from OpenStack volume snapshots, so for the purposes of creating a server, they’re indistinguishable from images. A custom configuration of an Ubuntu cloud image (turned volume) can be snapshotted to function similarly to the above.

Create Your Own Server Image – Naive Way

This was the first way I attempted to create a usable Ubuntu Server image, but I mention it last because it was awkward and didn’t work as well for updating the image. However, if all you have is OpenStack, this will work fine.

Creating a new image from an Ubuntu installation ISO is straightforward from the OpenStack Horizon UI. Go to Compute -> Images, click Create Image and follow the instructions. This has to be done with an admin account to make it a public image.

After that, create a server instance using the new ISO image, create a volume large enough for the installed base (>6GB for Ubuntu 18.04 Server), attach that new volume to the installer instance, and reboot it if necessary to make the volume appear on the instance. Then go through the installation process.

Detach the volume with the installed system, destroy the installation server, and create a server that boots directly from the Ubuntu 18.04 Server volume. This will allow us to do some direct configuration of the volume before turning it into an OpenStack image. cloud-init will have been installed and initially configured by the Ubuntu 18.04 Server installer. Follow the steps in the previous section to prep cloud-init to work with OpenStack on this new image.

Creating an image from a volume is fairly straightforward. There is an option in the OpenStack UI, Upload to Image. Make sure to specify 6GB as the minimum volume disk size.

Note: If you don’t have enough storage to make any of the above work, OpenStack doesn’t give you much, if any, warning. Neither the UI nor the cinder upload-to-image command gives any notice about running out of space. The operation just silently fails. A little more on that below.

You may want to hold on to the original Ubuntu 18.04 Server installation volume because iteratively creating volumes from images and images from volumes seems to cause the sizes to grow. By keeping the original volume, you can go back to the original installation to apply package updates.

Create a Desktop Image

For testing of desktop applications, it’s helpful to have Ubuntu Desktop automatically log into a user account and have an autostart script kick off the testing job.

Auto-login via the UI settings is iffy when booting via OpenStack with server configuration enabled. Sometimes it goes through to the user desktop. Most of the time it just stops at the login prompt. You can verify that auto-login is enabled and the account that is set up to login by looking at /etc/gdm3/custom.conf. Looking through the boot logs, I find in auth.log that gdm-login is failing due to:

no password is available for user.

I chalked this up to a race condition between OpenStack server configuration and the desktop boot sequence. This answer suggests an alternative way to auto-login: https://askubuntu.com/questions/1136457/how-to-restart-gdm3-with-autologin-on-18-04-and-19-04

After many testing iterations, the suggestion looks solid.

The following steps will set up an Ubuntu Desktop cloud image (note, done on KVM through virt-manager):

  • Create a 9 GB volume (smallest allowable with desktop installer)
  • Install Ubuntu Desktop 18.04 on the volume.
  • Boot into the new desktop volume
  • apt install cloud-init
  • Update /etc/cloud/cloud.cfg so that the default user is the one set up during installation: system_info: default_user: name: …
  • vi /etc/gdm3/custom.conf
  • Add the lines in the [daemon] section:
    TimedLoginEnable = true
    TimedLogin = <user>
    TimedLoginDelay = 1
  • Shut down the VM
  • Use qemu-img command mentioned earlier to shrink the volume.

This creates an image of virtual size 9GB and real size between 6 and 7 GB. With cloud-init all of the boot configuration gets done. Note that the user to be auto-logged in must have a password set. Otherwise the auto-login feature will fail.
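The custom.conf edit can also be scripted rather than done by hand in vi. The following is a rough sketch using Python's standard configparser; the function name enable_timed_login is hypothetical, and GDM's section is assumed to be named [daemon]. Be aware that configparser drops comment lines when it writes the file back out, so the stock commentary in custom.conf would be lost.

```python
import configparser
import io


def enable_timed_login(conf_text, user, delay=1):
    """Return custom.conf text with timed login enabled for user."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep key case as written; configparser lowercases keys by default.
    parser.optionxform = str
    parser.read_string(conf_text)
    if not parser.has_section("daemon"):
        parser.add_section("daemon")
    parser["daemon"]["TimedLoginEnable"] = "true"
    parser["daemon"]["TimedLogin"] = user
    parser["daemon"]["TimedLoginDelay"] = str(delay)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    original = "[daemon]\n#WaylandEnable=false\n\n[security]\n"
    print(enable_timed_login(original, "tester"))
```

Reading /etc/gdm3/custom.conf, passing its text through this function, and writing the result back (as root) would reproduce the manual steps above.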

On my test OpenStack environment it takes 10-30 min to boot this image to the desktop, with volume cached. Directly on KVM (the system running OpenStack), it takes less than 2 min to boot to desktop. I’ll assume for now that the multi-level virtualization isn’t helping performance and revisit it when I’m working directly on metal. This might also have been causing problems with auto-login, but I’ll leave that alone since I prefer a system that works both virtualized and raw.

On-Demand Volume Creation/Mounting

I need a quick way to wrap up local data and make it available to the remote OpenStack servers.

For example: In a KVM setup, I can rsync local data to a remote directory on the KVM server and mount that directory read-only to the VM instances. OpenStack (or at least DevStack) doesn’t appear to have that capability by default — preferring to work in terms of images (glance service) and volumes (cinder service).

The OpenStack Horizon interface doesn’t allow mounting a volume read-only (or otherwise) to multiple servers. While I’ve seen some mention of multi-mounting (and possible complications), I’m going to assume that this isn’t the OpenStack design and plan to make a volume per server that requires it.

OpenStack volumes aren’t readily cloned, requiring a snapshot or image be made. So to go this route with a volume would require creating the initial volume, snapshotting, and then creating a new volume from the snapshot on each server boot. Images can be created directly via the API. Snapshots appear to need a volume for reference. That makes the image workflow a little simpler, so I’ll start with that.

Since we’ll be dealing in volumes and not synchronizing directories, we will need a way to create volumes on the local dev system, preferably without requiring root.

The GuestFish set of tools can be used to create a volume in user-mode on Linux. However, the running kernel will have to be readable by the user of the tools. A script added to /etc/kernel/postinst.d might take care of this. Access to /dev/kvm is also preferred.

This demonstrates how to make GuestFish work: http://libguestfs.org/guestfs-python.3.html. Code to make a volume from a directory, .tar, or .zip file may be forthcoming.

On a relatively light ultrabook without access to /dev/kvm, creating a qcow2-formatted EXT4 volume of any size with GuestFish takes at least 15s.
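A rough sketch of what building such a volume from an uncompressed .tar file might look like with the libguestfs Python bindings follows. It has not been run against the setup described here. The function names are hypothetical, and the sizing rule (25% headroom, 64 MiB minimum) is an assumption chosen for illustration, not a measured requirement.

```python
import os

MIB = 1024 * 1024


def volume_size_bytes(payload_bytes, overhead=0.25, minimum=64 * MIB):
    """Pick a volume size: payload plus headroom, rounded up to a MiB.

    The overhead and minimum are illustrative guesses at what an EXT4
    filesystem needs beyond the raw payload.
    """
    wanted = max(int(payload_bytes * (1 + overhead)), minimum)
    return -(-wanted // MIB) * MIB  # ceiling division to a MiB boundary


def make_volume_from_tar(tar_path, volume_path):
    """Create a qcow2 EXT4 volume holding the contents of tar_path."""
    import guestfs  # imported lazily; requires the libguestfs bindings

    size = volume_size_bytes(os.path.getsize(tar_path))
    g = guestfs.GuestFS(python_return_dict=True)
    g.disk_create(volume_path, "qcow2", size)
    g.add_drive_opts(volume_path, format="qcow2", readonly=0)
    g.launch()
    device = g.list_devices()[0]
    g.mkfs("ext4", device)   # whole-disk filesystem, no partition table
    g.mount(device, "/")
    g.tar_in(tar_path, "/")  # unpack the archive into the new filesystem
    g.shutdown()
    g.close()
```

The filesystem is placed directly on the device without a partition table, which keeps mounting inside the guest simple; a partitioned layout would need an extra part_disk step.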

Pushing the volume data as a new image works well. Once there, volumes can be created from the images. This seems like a slight twisting of the intent of OpenStack images, but I like that it’s clear that this data SHOULD NOT CHANGE. Also, volumes work in terms of GB whereas images appear to be any size. That made me shy away from volumes as the purpose of this function is to push data on-demand.

Python API calls to create the image:

import glanceclient.client

volume_name = "Chosen Volume Name"
volume_file = "qcow-formatted-volume.qcow2"
glance = glanceclient.client.Client( <server-info-and-login-credentials> )
new_image = glance.images.create(name=volume_name, container_format="bare", disk_format="qcow2", min_disk=1)
# Open in binary mode: a qcow2 file is not text.
with open(volume_file, "rb") as f:
    glance.images.upload( new_image.id, f )

At this point, I’m satisfied that I can push data to OpenStack from a dev system.

An alternative, possibly lighter technique uses the OpenStack shared filesystem service, Manila. It can be found here: https://docs.openstack.org/manila/pike/install/post-install.html

Investigation of this will have to wait for another day.

Incremental Image/Volume Update

As noted earlier, rsync to a directory on the VM host makes for a quick way to do incremental updates to data that is to be mounted to VMs. It doesn’t look like OpenStack supports anything like this with images or volumes by default, requiring instead a complete re-upload of data.

If Manila for OpenStack works well, that might be an option.

Automated Server Creation

Arbitrary volume configurations can be passed to the server creation API (Python) novaclient.client.Client(…).servers.create( …, block_device_mapping_v2=… )

Note that you must pass a None image for this to take effect. I didn’t find any direct documentation on the format of the parameter, but the API code reveals a structure of at least:

# Note that the 'volume_size' parameter has no effect if
# 'destination_type' is 'local'.
block_device_mapping_v2 = [
    {
        'uuid': image_id,
        'source_type': 'image',
        'destination_type': 'volume',
        'volume_size': required_disk_size,
        'boot_index': 0,
        'delete_on_termination': True
    },
    {
        'uuid': other_image_id,
        'source_type': 'image',
        'destination_type': 'volume',
        'volume_size': other_image_min_disk,
        'boot_index': 1,
        'delete_on_termination': True
    }
]

source_type and destination_type are described pretty well here: https://docs.openstack.org/nova/latest/user/block-device-mapping.html
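When the number of volumes varies per server, the mapping can be generated instead of written out by hand. This is a small sketch; image_volume_mapping is a hypothetical helper name, and it only covers the image-to-volume case shown above.

```python
def image_volume_mapping(images, delete_on_termination=True):
    """Build a block_device_mapping_v2 list from (image_id, size_gb) pairs.

    The first pair becomes the boot volume (boot_index 0); the rest
    follow in order.
    """
    return [
        {
            "uuid": image_id,
            "source_type": "image",
            "destination_type": "volume",
            "volume_size": size_gb,
            "boot_index": index,
            "delete_on_termination": delete_on_termination,
        }
        for index, (image_id, size_gb) in enumerate(images)
    ]


if __name__ == "__main__":
    # Placeholder IDs; real values come from the image service.
    mapping = image_volume_mapping([("boot-image-id", 10), ("data-image-id", 1)])
    print(mapping)
    # The result would then be passed along with a None image, e.g.:
    # nova.servers.create(name, None, flavor, block_device_mapping_v2=mapping)
```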

Script-based Server Removal

When creating servers via Horizon, you can set a switch to remove volumes on server destruction, which does exactly as it suggests.

Removing a server via Horizon where the volumes have been set to delete_on_termination also removes the volumes.

This is also confirmed to work when removing servers via the API (Python) novaclient.client.Client(…).servers.delete( server )

Problems in my OpenStack Test Environment

Loss of Service Interconnectivity

If the test VM IP changes for some reason, all OpenStack services are lost. Service information is stored as IPs throughout the config files. Changing the config files and restarting the node doesn’t seem to correct the problem. In fact, the config files in /etc don’t seem to have any impact at all. This problem is possible in my DHCP-based test setup, and it hit. Fixed IP is a must.

Server Creation Timeouts

First-time creation of a server from any sufficiently large image times out (3 min), requiring the server to be destroyed and re-created. Since servers might be created with scripts, some volume cleanup may also be necessary. The sluggishness of volume creation could just be a problem with my system, but it seems like OpenStack might manage its timeouts better. Volumes that have been cached are created within seconds.
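One way a script might cope with this is to wrap creation in a retry loop. The sketch below is deliberately independent of any OpenStack client: create, status, and cleanup are hypothetical callables supplied by the caller, and the status strings are assumed to follow Nova's BUILD / ACTIVE / ERROR convention.

```python
import time


def create_with_retry(create, status, cleanup, attempts=2, timeout=300,
                      poll=5, sleep=time.sleep, clock=time.monotonic):
    """Create a server, retrying after cleanup if it fails to go ACTIVE.

    create()        -> returns a server handle
    status(server)  -> returns "BUILD", "ACTIVE", "ERROR", ...
    cleanup(server) -> removes the failed server and leftover volumes
    """
    for _ in range(attempts):
        server = create()
        deadline = clock() + timeout
        state = status(server)
        # Poll while the server is still building and time remains.
        while state == "BUILD" and clock() < deadline:
            sleep(poll)
            state = status(server)
        if state == "ACTIVE":
            return server
        cleanup(server)
    raise RuntimeError("server not ACTIVE after %d attempts" % attempts)
```

Given the observation that the second attempt is only fast once the first volume has finished constructing, a real cleanup callable would probably need to wait for that volume to settle before deleting it.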

New Problems With OpenStack

Storage Leak

It appears that the OpenStack tools can fall out of sync with the volume backing. After several rounds of creating and destroying volumes, I find that I can’t create new volumes. After investigating the messages associated with the problem, it’s usually due to a lack of space. However, I can delete everything via the OpenStack UI and the problem still doesn’t get corrected. From what I can tell, there is still space available on the host.

Running sudo vgs revealed that I had virtually no free space in the logical volume groups.

Running sudo lvscan shows all of the volumes taking up my volume space. If I remove one manually with sudo lvremove, I can create volumes again for a little while. After a few rounds of this, I still end up unable to create volumes, and I just end up rebuilding the test node.

As a test, I created an OpenStack volume, observed it in the lvscan list, removed it, and saw that it got removed from the volume group. Some volumes get left behind. I suspect that those volumes are part of a cache. I would expect a cache to perform better than this in low resource circumstances.
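To see which logical volumes are being left behind, the LVM listing can be compared against the volume IDs that Cinder reports. The sketch below assumes Cinder's default naming of volume-<id> for logical volumes, and that the names come from something like sudo lvs --noheadings -o lv_name. Both function names are hypothetical.

```python
def parse_lvs_names(output):
    """Split the output of 'lvs --noheadings -o lv_name' into names."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def orphaned_logical_volumes(lv_names, cinder_volume_ids, prefix="volume-"):
    """Return logical volumes that look like Cinder volumes but are unknown.

    Names without the prefix (thin pools, snapshots, unrelated volumes)
    are ignored rather than reported.
    """
    known = {prefix + volume_id for volume_id in cinder_volume_ids}
    return sorted(
        name for name in lv_names
        if name.startswith(prefix) and name not in known)


if __name__ == "__main__":
    listing = "  volume-aaa\n  volume-bbb\n  some-pool\n"
    print(orphaned_logical_volumes(parse_lvs_names(listing), ["aaa"]))
```

If the ID list is not gathered across all projects, volumes belonging to an image cache may be reported as orphans, which would be consistent with the suspicion above.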

Disappearing Services

The stack test node has hit what looked like low-memory conditions and started swapping. I let the system go for a while and returned to remove the active servers and volumes. It initially appeared to be functioning again.

However, I didn’t realize that the configuration service was either gone or malfunctioning, so no new servers were going to get any bootup configuration data. This was particularly unfortunate because I was experimenting with getting a custom image configured.

Out-of-Space Errors

The messages presented via Horizon regarding why an operation failed are not the most helpful. Most of the time you are at least told there is an error with an operation. The messages tend not to point out the cause, but at least you have a point from which to dig. In some cases, such as uploading an image from a volume, the operation will just silently fail and then you’re left digging through logs to try to find out what happened.

Granted, a production environment would be independently monitoring available system storage, but a little more robustness around error reporting would be helpful.