
OpenStack

I was recently searching through GitHub looking for projects that could improve my homelab. I came up dry, but I did discover OpenStack and decided exploring it would offer a rich learning experience in the realm of public cloud.

OpenStack is stewarded by the non-profit OpenInfra Foundation. This means it's unlikely to ever run into a licensing rug-pull (looking at you Unity, HashiCorp, Prism) and therefore seems safe to adopt. OpenStack Overview Diagram

OpenStack is a 'cloud operating system' allowing you to control compute, network and storage via an API or web GUI. It goes beyond standard IaaS functionality to also offer orchestration, fault management, service management and high availability.

It uses a plug-in architecture pattern, allowing you to swap components in and out as you see fit. This is a fantastic graphic of the ecosystem they offer: openstack ecosystem

There are a lot of tools here I'm unfamiliar with. After reading about a few of them, I realized that's because they're all specific to OpenStack.

Of interest to me is container orchestration, as all my workloads are containerized. Magnum is the component of OpenStack responsible for provisioning the orchestrator. It can integrate with Kubernetes or Docker Swarm, Keystone for multi-tenant security, Neutron for networking, and Cinder for volume services. There's a CLI, but it only has one command, magnum-status, which checks the status of upgrades.

Zun, however, is an API for launching containers as OpenStack-managed resources. You would use Magnum to provision a cluster and hand it off to a customer; Zun helps you provide CaaS (Containers as a Service). The container-management API it provides has all the building blocks for most of what I run in my homelab.
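
To give a flavour, launching a container through Zun's CLI plugin looks roughly like this (a sketch based on the Zun docs; it assumes python-zunclient is installed, and the network ID and image are placeholders):

# Run a container on a tenant network and check on it
$ openstack appcontainer run --name test \
    --net network=$NET_ID cirros ping 8.8.8.8
$ openstack appcontainer list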

Swift is a highly available object/blob store which is [compatible with S3](https://docs.openstack.org/swift/latest/s3_compat.html). I would certainly use this as a service to give to applications as well.
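
As a sketch of how an application could consume it through the S3 API (assuming the s3api middleware is enabled; the endpoint URL here is hypothetical):

# Create EC2-style credentials in Keystone for S3 clients
$ openstack ec2 credentials create
# Then point any S3 client at Swift, e.g. the aws CLI
$ aws --endpoint-url http://swift.example.com:8080 s3 ls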

Cinder provides block storage. This is important for any applications that need a persistent data mount, which unfortunately happens to be a lot of open source software. I'd prefer everything be stored in a database or S3.

Speaking of databases, Trove is database-as-a-service, allowing you to provision relational and non-relational databases. Another required building block for my homelab environment.

Keystone provides a wide range of identity services: API client auth, service discovery and multi-tenant authorization. It supports LDAP, OpenID, OAuth, SAML and SQL. Great!

Finally, Horizon provides a web GUI and Neutron provides networking.

This all sounds great, so how do I install it? Looking at the Lifecycle Management Tools it becomes clear that OpenStack is a collection of applications. You can deploy it onto VMs at a performance hit, or onto bare metal as recommended.

Here is a 'conceptual' architecture diagram they have in their docs: Alt text

The diagram could be constructed from the one-line description of each service. This is the 'logical' architecture diagram: Alt text

This defines data transmission and storage, and runtimes for each service.

Node Installation

I bet there's tooling out there to automate installation of this stack, but I can't think of a better way to learn about it than deploying and configuring each piece individually as the documentation has you do. I'll trust the smart people behind the project and go down that path.

Following along with the OpenStack Installation Guide, they have a requirements diagram for the reference architecture: Alt text

I will need block storage out of the gate, but can live without object storage for now.

  • Set up three Ubuntu 20.04+ VMs on Proxmox matching the requirements above
  • openstack-1 == controller, openstack-2 == compute node, openstack-3 == block storage

Alt text

The controller runs the Identity service, Image service, Placement service, the management portions of Compute and Networking, and the Dashboard. It can also run SQL, the message queue and NTP.

Compute nodes run the KVM hypervisor, plus agents for networking and security.

The block storage node hosts the shared file system. It uses the management network only and therefore requires just one NIC.

Now, there are two options for the virtual network layout that you have to choose between. There is a warning that option 1 supports fewer features.

Alt text Alt text

It took me a while to spot the differences in the diagrams, because the only visual difference is an added Layer 3 networking agent in the self-service option. The real difference is that Option 1 is as simple as possible, with primarily layer-2 (bridging/switching) services and VLAN segmentation. It also offers DHCP.

Option 2 leverages NAT and layer-3 (routing) services using overlay segmentation methods like VXLAN.

The OpenStack user can create virtual networks without knowledge of the underlying infrastructure on the data network.

I suppose I'll choose option 2 - self-service networks. They mention that performance is reduced when installing on VMs instead of onto bare metal, which makes sense.

The installation guide uses password authentication to make things easier, but they point to how you should configure more secure methods.

They provide a table of passwords: Alt text

Then we have to manually configure the host networking to be compatible with their network layout: Alt text

I can place the 'Provider' network on my LAN VLAN, where services can be accessed from anywhere inside my network. I can then place the Management network in my existing Management network! This network is isolated from everything else within my network aside from one specific route for accessing management ports and interfaces. The network already has a NAT Gateway for downloading updates.

In their example, the 'Provider' network appears to be a WAN subnet. Hopefully it works fine with an RFC 1918 network CIDR.

I can add VirtIO devices to each VM, supplying the VLAN tag for the required subnet. My management network does not have DHCP, so I'll also need to set up static IPs and DNS records for each host. Alt text

To the controller and worker I added two interfaces: Alt text

And for the Block Storage node I added just the management interface: Alt text

I booted one node at a time and confirmed the network configuration was working as expected, adding firewall rules as required. Here is the netplan file for the controller: Alt text
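
For reference, a minimal sketch of that netplan; the interface names (ens18/ens19) and the gateway address are assumptions from my Proxmox setup:

# /etc/netplan/00-installer-config.yaml (sketch)
network:
  version: 2
  ethernets:
    ens18:                 # management interface
      addresses: [10.100.0.20/24]
      routes:
        - to: 0.0.0.0/0
          via: 10.100.0.1  # management network's NAT gateway (assumed)
      nameservers:
        addresses: [10.100.0.1]
    ens19:                 # provider interface, left unaddressed per the guide
      dhcp4: no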

Each node must be able to resolve the others by specific hostnames as well. I wish I had known that at the start when I was setting hostnames, but it's fine; an addition to the hosts file is easy enough.

To each /etc/hosts I added:

10.100.0.20     controller
10.100.0.21     compute1
10.100.0.22     block1

Then we reboot all nodes.

Next the installation guide has us install and configure Chrony for NTP. On the controller, Chrony is configured to allow connections from the management network.
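
Roughly, that looks like this (10.100.0.0/24 is my management subnet):

# On the controller
$ sudo apt install chrony
$ sudo vim /etc/chrony/chrony.conf
# Add: allow NTP clients from the management network
allow 10.100.0.0/24
$ sudo service chrony restart
# On the other nodes, replace the pool lines with the controller
server controller iburst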

Now we can move on to installing the OpenStack packages. I did a quick apt upgrade to ensure everything was at the latest versions. The package page did not state which repository was the latest release, so I found this chart elsewhere: Alt text

I suppose I should install Yoga. I added the required repository to each node, installed the OS packages with apt install nova-compute, and the client with apt install python3-openstackclient.
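
For reference, the commands look roughly like this on Ubuntu 20.04 (on 22.04 Yoga ships in the default repos, so the cloud-archive step can be skipped):

$ sudo add-apt-repository cloud-archive:yoga
$ sudo apt update && sudo apt dist-upgrade
# Hypervisor service on the compute node
$ sudo apt install nova-compute
# CLI on whichever node you want to manage from
$ sudo apt install python3-openstackclient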

Next you have to create an SQL server (MariaDB) on the controller for the cluster to store state. Easy enough to follow their instructions: two commands and a config file change.
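
A sketch of those steps, with the bind address set to my controller's management IP:

$ sudo apt install mariadb-server python3-pymysql
$ sudo vim /etc/mysql/mariadb.conf.d/99-openstack.cnf
# New file, per the install guide
[mysqld]
bind-address = 10.100.0.20
default-storage-engine = innodb
innodb_file_per_table = on
max_connections = 4096
collation-server = utf8_general_ci
character-set-server = utf8
$ sudo service mysql restart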

Then you have to install a message queue on the controller node...

apt install rabbitmq-server
# Create the openstack user (replace RABBIT_PASS with a suitable password)
rabbitmqctl add_user openstack RABBIT_PASS
# Grant configure/write/read permissions on all resources
rabbitmqctl set_permissions openstack ".*" ".*" ".*"

We also have to install memcached and finally etcd. I'm noticing a lot of components from k8s here.
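
For memcached, the guide's steps look roughly like this (bound to my controller's management IP); the etcd commands follow below:

$ sudo apt install memcached python3-memcache
$ sudo vim /etc/memcached.conf
# Change the listen line from -l 127.0.0.1 to the management IP
-l 10.100.0.20
$ sudo service memcached restart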

apt install etcd
vim /etc/default/etcd
# Edit bind addresses...
systemctl enable etcd
systemctl restart etcd

Now, by default my firewall is going to block the ports and protocols that all these new services require to communicate between the nodes, but I'll deal with that later.
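
For reference, the default ports the services in this guide listen on (all reachable over the management network):

3306   MariaDB         5672   RabbitMQ
11211  memcached       2379   etcd (client)
5000   Keystone        9292   Glance
8778   Placement       8774   Nova API
6080   noVNC proxy     9696   Neutron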

Service Installation

At this point, we now have to install all the OpenStack services. For my setup I'll need:

Identity Service - Keystone

Following along with the docs, we first create a database:

# On Controller
$ sudo su
$ id
uid=0(root) gid=0(root) groups=0(root)
$ mysql 
MariaDB [(none)]> CREATE DATABASE keystone;
Query OK, 1 row affected (0.001 sec)
# Replace KEYSTONE_DBPASS with a valid password
MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost' \
    -> IDENTIFIED BY 'KEYSTONE_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' \
    -> IDENTIFIED BY 'KEYSTONE_DBPASS';
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> exit;
Bye
$ apt install keystone
$ vim /etc/keystone/keystone.conf
...
# In token section, configure Fernet token provider
[token]
# ...
provider = fernet
# Populate DB
$ su -s /bin/sh -c "keystone-manage db_sync" keystone
# Init Fernet Key repos, running keystone as OS user keystone
$ keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
$ keystone-manage credential_setup --keystone-user keystone --keystone-group keystone
# Bootstrap the service, change ADMIN_PASS for something suitable for admin user
$ keystone-manage bootstrap --bootstrap-password ADMIN_PASS \
  --bootstrap-admin-url http://controller:5000/v3/ \
  --bootstrap-internal-url http://controller:5000/v3/ \
  --bootstrap-public-url http://controller:5000/v3/ \
  --bootstrap-region-id RegionOne
# Configure Apache Server
$ echo 'ServerName controller' >> /etc/apache2/apache2.conf 
# Restart Apache
$ service apache2 restart

The final step of initial installation is to 'Configure the administrative account by setting the proper environment variables' with a bunch of exports. I assume these should get added to the root account's shell profile?

$ cat << EOF >> /root/.bashrc
export OS_USERNAME=admin
export OS_PASSWORD=ADMIN_PASS
export OS_PROJECT_NAME=admin
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
EOF

Next in the docs it describes how to create a domain, users and projects. There should already be a default domain, so for now I'm just going to use that.

To test, we request an auth token:

$ openstack --os-auth-url http://controller:5000/v3 \
  --os-project-domain-name Default --os-user-domain-name Default \
  --os-project-name admin --os-username admin token issue
Password:
+------------+----------------------------------+
| Field      | Value                            |
+------------+----------------------------------+
| expires    | 2023-09-22T16:26:15+0000         |
| id         | ID_TOKEN                         |
| project_id | 45f64608eb1e4d5181615752ba362134 |
| user_id    | 6b736944b2e246ccad34c99e07dd43e3 |
+------------+----------------------------------+

The next documentation section clarifies my question about environment variables.

We create an admin-openrc file anywhere on the system containing those variables. I'll remove them from root's profile; it did seem incorrect.

$ vim /root/.bashrc
# Remove exports
$ exit
$ id
uid=1000(matt) gid=1000(matt) groups=1000(matt),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),116(lxd),122(libvirt)
$ vim ~/admin-openrc
# Add
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=ADMIN_PASS
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2

They also have us create one for the demo user, which I think I skipped creating... I'll circle back and actually create those things to make sure I can fully follow along.

$ . admin-openrc
$ openstack domain create --description "An Example Domain" example
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | An Example Domain                |
| enabled     | True                             |
| id          | 9332162dba8b43f782957024a2eacec5 |
| name        | example                          |
| options     | {}                               |
| tags        | []                               |
+-------------+----------------------------------+

$ openstack project create --domain default \
>   --description "Service Project" service
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | Service Project                  |
| domain_id   | default                          |
| enabled     | True                             |
| id          | 69c6fc2852da45bca843c0a535ca841d |
| is_domain   | False                            |
| name        | service                          |
| options     | {}                               |
| parent_id   | default                          |
| tags        | []                               |
+-------------+----------------------------------+

$ openstack project create --domain default \
>   --description "Demo Project" myproject
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | Demo Project                     |
| domain_id   | default                          |
| enabled     | True                             |
| id          | c30e3ce3e89a4bc9becb4a4d8887ccf9 |
| is_domain   | False                            |
| name        | myproject                        |
| options     | {}                               |
| parent_id   | default                          |
| tags        | []                               |
+-------------+----------------------------------+

$ openstack user create --domain default \
>   --password-prompt myuser
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| domain_id           | default                          |
| enabled             | True                             |
| id                  | 74d9470cb64f4d6e8747267cced43047 |
| name                | myuser                           |
| options             | {}                               |
| password_expires_at | None                             |
+---------------------+----------------------------------+

$ openstack role create myrole
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | None                             |
| domain_id   | None                             |
| id          | 8c896b7b237c45d79f932c20b72c734b |
| name        | myrole                           |
| options     | {}                               |
+-------------+----------------------------------+

$ openstack role add --project myproject --user myuser myrole

Sweet, now I can create the demo user profile:

$ vim demo-openrc
# Add
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=myproject
export OS_USERNAME=myuser
export OS_PASSWORD=DEMO_PASS
export OS_AUTH_URL=http://controller:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_IMAGE_API_VERSION=2

Image Service - Glance

Following along with the docs again...

# Create a DB
$ sudo su
$ id
uid=0(root) gid=0(root) groups=0(root)
$ mysql
MariaDB [(none)]> CREATE DATABASE glance;
Query OK, 1 row affected (0.001 sec)
# Replace GLANCE_DBPASS with appropriate password
MariaDB [(none)]> GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'localhost' \
    ->   IDENTIFIED BY 'GLANCE_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'%' \
    ->   IDENTIFIED BY 'GLANCE_DBPASS';
Query OK, 0 rows affected (0.009 sec)

MariaDB [(none)]> exit
Bye
$ exit
$ . ~/admin-openrc
$ openstack user create --domain default --password-prompt glance
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| domain_id           | default                          |
| enabled             | True                             |
| id                  | 95326391733243a9a1dffde5e14bbf97 |
| name                | glance                           |
| options             | {}                               |
| password_expires_at | None                             |
+---------------------+----------------------------------+

$ openstack role add --project service --user glance admin

$ openstack service create --name glance \
  --description "OpenStack Image" image
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | OpenStack Image                  |
| enabled     | True                             |
| id          | ca7c32add8fa439fa2985ddceee387a2 |
| name        | glance                           |
| type        | image                            |
+-------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
  image public http://controller:9292
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | f5e5a62bd20248a59ff2a847853cd1db |
| interface    | public                           |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | ca7c32add8fa439fa2985ddceee387a2 |
| service_name | glance                           |
| service_type | image                            |
| url          | http://controller:9292           |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   image internal http://controller:9292
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 400908189fa547c7be485f556024a6ed |
| interface    | internal                         |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | ca7c32add8fa439fa2985ddceee387a2 |
| service_name | glance                           |
| service_type | image                            |
| url          | http://controller:9292           |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   image admin http://controller:9292
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 5a7a40cfa66e4ac684622653985d6049 |
| interface    | admin                            |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | ca7c32add8fa439fa2985ddceee387a2 |
| service_name | glance                           |
| service_type | image                            |
| url          | http://controller:9292           |
+--------------+----------------------------------+

I skipped adding quotas at any granularity for now. Next we install and configure Glance!

$ sudo apt install glance
$ vim /etc/glance/glance-api.conf
# Configure DB access
[database]
# ...
connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance
# Configure Identity Service Access
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = glance
password = GLANCE_PASS

[paste_deploy]
# ...
flavor = keystone

[glance_store]
# ...
stores = file,http
default_store = file
filesystem_store_datadir = /var/lib/glance/images/

# Populate DB
$ su -s /bin/sh -c "glance-manage db_sync" glance
2023-09-22 16:45:25.311 88841 INFO alembic.runtime.migration [-] Context impl MySQLImpl.
...
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
Database is synced successfully.
# Restart service to complete
$  service glance-api restart

Verify Operation by uploading an image

# Load admin profile
$ . admin-openrc
# Download an image to test with
$ cd /tmp && wget http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img
...
2023-09-22 16:47:38 (51.8 MB/s) - ‘cirros-0.4.0-x86_64-disk.img’ saved [12716032/12716032]

# Upload the image using QCOW2 disk format, bare container format, and public visibility so all projects can access it
$ glance image-create --name "cirros" \
  --file cirros-0.4.0-x86_64-disk.img \
  --disk-format qcow2 --container-format bare \
  --visibility=public

+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | 443b7623e27ecf03dc9e01ee93f67afe                                                 |
| container_format | bare                                                                             |
| created_at       | 2023-09-22T17:04:42Z                                                             |
| disk_format      | qcow2                                                                            |
| id               | 379e3a82-824c-4aff-a2cb-2f65a736f4b0                                             |
| min_disk         | 0                                                                                |
| min_ram          | 0                                                                                |
| name             | cirros                                                                           |
| os_hash_algo     | sha512                                                                           |
| os_hash_value    | 6513f21e44aa3da349f248188a44bc304a3653a04122d8fb4535423c8e1d14cd6a153f735bb0982e |
|                  | 2161b5b5186106570c17a9e58b64dd39390617cd5a350f78                                 |
| os_hidden        | False                                                                            |
| owner            | 45f64608eb1e4d5181615752ba362134                                                 |
| protected        | False                                                                            |
| size             | 12716032                                                                         |
| status           | active                                                                           |
| tags             | []                                                                               |
| updated_at       | 2023-09-22T17:04:43Z                                                             |
| virtual_size     | 46137344                                                                         |
| visibility       | public                                                                           |
+------------------+----------------------------------------------------------------------------------+

$ glance image-list
+--------------------------------------+--------+
| ID                                   | Name   |
+--------------------------------------+--------+
| 379e3a82-824c-4aff-a2cb-2f65a736f4b0 | cirros |
+--------------------------------------+--------+

Awesome!

Placement Service

Once again, following the docs

# Setup DB
$ sudo su
# Replace PLACEMENT_DBPASS with a valid password
$ mysql
MariaDB [(none)]> CREATE DATABASE placement;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'localhost' \
    ->   IDENTIFIED BY 'PLACEMENT_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'%' \
    ->  IDENTIFIED BY 'PLACEMENT_DBPASS';
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> exit;
Bye
# Create Keystone user and endpoints
$ exit
$ . admin-openrc
$ openstack user create --domain default --password-prompt placement
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| domain_id           | default                          |
| enabled             | True                             |
| id                  | cf94d951f7ac48c3af286e98fafc61ef |
| name                | placement                        |
| options             | {}                               |
| password_expires_at | None                             |
+---------------------+----------------------------------+

$ openstack role add --project service --user placement admin

$ openstack service create --name placement \
  --description "Placement API" placement
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | Placement API                    |
| enabled     | True                             |
| id          | 61a170fa076544bc9e098257d2bee7b8 |
| name        | placement                        |
| type        | placement                        |
+-------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
  placement public http://controller:8778
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 3c2e95a34d444048a70b75109217b347 |
| interface    | public                           |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 61a170fa076544bc9e098257d2bee7b8 |
| service_name | placement                        |
| service_type | placement                        |
| url          | http://controller:8778           |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   placement internal http://controller:8778
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | db220e6e0698420691d43c7cee67b947 |
| interface    | internal                         |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 61a170fa076544bc9e098257d2bee7b8 |
| service_name | placement                        |
| service_type | placement                        |
| url          | http://controller:8778           |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   placement admin http://controller:8778
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | ce11964744584454b76c378b06bbd447 |
| interface    | admin                            |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 61a170fa076544bc9e098257d2bee7b8 |
| service_name | placement                        |
| service_type | placement                        |
| url          | http://controller:8778           |
+--------------+----------------------------------+

$ sudo apt install placement-api
$ sudo vim /etc/placement/placement.conf
# Make the following edits
[placement_database]
# ...
connection = mysql+pymysql://placement:PLACEMENT_DBPASS@controller/placement
[api]
# ...
auth_strategy = keystone

[keystone_authtoken]
# ...
auth_url = http://controller:5000/v3
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = placement
password = PLACEMENT_PASS
# Populate DB
$ su -s /bin/sh -c "placement-manage db sync" placement
# Restart apache
$ service apache2 restart

Verify:

$ . admin-openrc
$  placement-status upgrade check
... value required for option connection in group [placement_database] ...
# Turns out this is an issue with my user's ability to read the placement configs
# Add my user to the placement group created by the package
$ sudo usermod -aG placement matt
# Relog
$ exit
$ groups
matt adm cdrom sudo dip plugdev lxd libvirt placement

$ placement-status upgrade check
+-------------------------------------------+
| Upgrade Check Results                     |
+-------------------------------------------+
| Check: Missing Root Provider IDs          |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
| Check: Incomplete Consumers               |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
| Check: Policy File JSON to YAML Migration |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+

Compute Service - Nova

Controller Node

Docs. Configure DB:

$ sudo su
$ mysql
MariaDB [(none)]> CREATE DATABASE nova_api;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> CREATE DATABASE nova;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> CREATE DATABASE nova_cell0;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'localhost' \
    ->   IDENTIFIED BY 'NOVA_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'%' \
    ->   IDENTIFIED BY 'NOVA_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]>  GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost' \
    ->   IDENTIFIED BY 'NOVA_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%' \
    ->   IDENTIFIED BY 'NOVA_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'localhost' \
    ->   IDENTIFIED BY 'NOVA_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'%' \
    ->   IDENTIFIED BY 'NOVA_DBPASS';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> exit;
Bye

Configure Keystone:

$ . admin-openrc
$ openstack user create --domain default --password-prompt nova
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| domain_id           | default                          |
| enabled             | True                             |
| id                  | e207792e41964e15bd5832d3fa09b272 |
| name                | nova                             |
| options             | {}                               |
| password_expires_at | None                             |
+---------------------+----------------------------------+

$ openstack role add --project service --user nova admin

$ openstack service create --name nova \
>   --description "OpenStack Compute" compute
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | OpenStack Compute                |
| enabled     | True                             |
| id          | 37eeee41dd6e441496bc8fd814de1439 |
| name        | nova                             |
| type        | compute                          |
+-------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   compute public http://controller:8774/v2.1
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 4deaa80f3066409682fa0027ac1a0be4 |
| interface    | public                           |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 37eeee41dd6e441496bc8fd814de1439 |
| service_name | nova                             |
| service_type | compute                          |
| url          | http://controller:8774/v2.1      |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   compute internal http://controller:8774/v2.1
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 94b4665880d942f3a6169d62a699f0e3 |
| interface    | internal                         |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 37eeee41dd6e441496bc8fd814de1439 |
| service_name | nova                             |
| service_type | compute                          |
| url          | http://controller:8774/v2.1      |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
>   compute admin http://controller:8774/v2.1
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | f9867a2714944b7cbfb738fd4fdbf951 |
| interface    | admin                            |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | 37eeee41dd6e441496bc8fd814de1439 |
| service_name | nova                             |
| service_type | compute                          |
| url          | http://controller:8774/v2.1      |
+--------------+----------------------------------+

Install and configure OS packages:

$ sudo apt install nova-api nova-conductor nova-novncproxy nova-scheduler
$ sudo vim /etc/nova/nova.conf
# Make the edits...
[api_database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api

[database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/
[api]
# ...
auth_strategy = keystone

[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS

[service_user]
send_service_user_token = true
auth_url = https://controller/identity
auth_strategy = keystone
auth_type = password
project_domain_name = Default
project_name = service
user_domain_name = Default
username = nova
password = NOVA_PASS
[DEFAULT]
# ...
# The docs use 10.0.0.11 here; set this to the controller's management IP (10.100.0.20 in my setup)
my_ip = 10.0.0.11
[vnc]
enabled = true
# ...
server_listen = $my_ip
server_proxyclient_address = $my_ip
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/lib/nova/tmp
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS

Seed Databases:

$ su -s /bin/sh -c "nova-manage api_db sync" nova
$ su -s /bin/sh -c "nova-manage cell_v2 map_cell0" nova
$ su -s /bin/sh -c "nova-manage cell_v2 create_cell --name=cell1 --verbose" nova
--transport-url not provided in the command line, using the value [DEFAULT]/transport_url from the configuration file
--database_connection not provided in the command line, using the value [database]/connection from the configuration file
e2c723e5-a876-4fdc-9826-96839f24d57b
$ su -s /bin/sh -c "nova-manage db sync" nova
# Verify nova cell0 and cell1 are registered correctly
$ su -s /bin/sh -c "nova-manage cell_v2 list_cells" nova
+-------+--------------------------------------+------------------------------------------+-------------------------------------------------+----------+
|  Name |                 UUID                 |              Transport URL               |               Database Connection               | Disabled |
+-------+--------------------------------------+------------------------------------------+-------------------------------------------------+----------+
| cell0 | 00000000-0000-0000-0000-000000000000 |                  none:/                  | mysql+pymysql://nova:****@controller/nova_cell0 |  False   |
| cell1 | e2c723e5-a876-4fdc-9826-96839f24d57b | rabbit://openstack:****@controller:5672/ |    mysql+pymysql://nova:****@controller/nova    |  False   |
+-------+--------------------------------------+------------------------------------------+-------------------------------------------------+----------+
# Finalize with restarting services
$ service nova-api restart
$ service nova-scheduler restart
$ service nova-conductor restart
$ service nova-novncproxy restart
Compute Node

This is where the rubber meets the road, if you will; I'll have to fix the firewall rules to get things working here. Following the docs.

# Configure the nova-compute package we installed earlier; if you skipped it, run `apt install nova-compute`
$ sudo vim /etc/nova/nova.conf
# Edit...
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller
my_ip = MANAGEMENT_INTERFACE_IP_ADDRESS

[api]
# ...
auth_strategy = keystone

[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
[service_user]
send_service_user_token = true
auth_url = https://controller/identity
auth_strategy = keystone
auth_type = password
project_domain_name = Default
project_name = service
user_domain_name = Default
username = nova
password = NOVA_PASS
[vnc]
# ...
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://controller:6080/vnc_auto.html
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/lib/nova/tmp
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS

Check for hardware acceleration support for VMs:

$ egrep -c '(vmx|svm)' /proc/cpuinfo
0

Apparently the way I've set up my VM, this isn't supported. I checked on my Proxmox host and it is supported there, but I'm not going to get stuck in this rabbit hole right now. You can set virt_type = qemu in the [libvirt] section of the config file to disable acceleration.
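
On Ubuntu that edit goes in the compute node's /etc/nova/nova-compute.conf:

$ sudo vim /etc/nova/nova-compute.conf
[libvirt]
# ...
virt_type = qemu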

Restart service...

$ service nova-compute restart

For now I allowed all traffic between machines on the management network: Alt text

Then I logged back into the controller node to add the worker:

$ openstack compute service list --service nova-compute
$ su -s /bin/sh -c "nova-manage cell_v2 discover_hosts --verbose" nova
[sudo] password for matt:
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': e2c723e5-a876-4fdc-9826-96839f24d57b
Found 0 unmapped computes in cell: e2c723e5-a876-4fdc-9826-96839f24d57b

Oh no, no compute nodes were discovered! Debugging this turned into a long process; as I've heard in reviews of OpenStack online, it's all great until you need to debug something. This is where the learning gets real.

Debugging Compute Node Addition Failure

I started by looking at the nova-compute logs on the compute node:

$ tail -n 10 /var/log/nova/nova-compute.log
...
2023-09-26 19:16:35.683 36015 WARNING nova.conductor.api [req-72b0165c-c6d5-4f09-b7c3-fdd70be2df8d - - - - -] Timed out waiting for nova-conductor.  Is it running? Or did this service start before nova-conductor?  Reattempting establishment of nova-conductor connection...: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID a1614ca980ae4234a3f8caeccb111e87
... Repeating ...

nova-conductor is failing to connect. I tailed the logs of the API on the controller and restarted the service. The request is not making it to the API: Alt text

In the logs from nova-conductor on the controller there is an authentication error!

$ tail -n 25 /var/log/nova/nova-conductor.log
...
2023-09-22 18:56:37.795 115979 ERROR nova.scheduler.client.report [req-f65304a0-2516-49d1-84ba-b0f6aa4aff2c - - - - -] Placement service credentials do not work.: keystoneauth1.exceptions.http.Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-60a7d5f6-46ed-4737-ad09-6c06fcda0d6
...
2023-09-22 18:53:22.877 115209 ERROR nova     resp = session.post(token_url, json=body, headers=headers,
2023-09-22 18:53:22.877 115209 ERROR nova   File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1149, in post
2023-09-22 18:53:22.877 115209 ERROR nova     return self.request(url, 'POST', **kwargs)
2023-09-22 18:53:22.877 115209 ERROR nova   File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 986, in request
2023-09-22 18:53:22.877 115209 ERROR nova     raise exceptions.from_response(resp, method, url)
2023-09-22 18:53:22.877 115209 ERROR nova keystoneauth1.exceptions.http.Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-4f253ff9-5916-4c11-8a96-bd792be6b7f8)

Perhaps I've misconfigured something with the placement service then... In the verify section of the Placement installation, the commands all return 503 for me, so certainly something is wrong.

I tried logging into keystone via the CLI as the placement user:

$ openstack --os-auth-url http://controller:5000/v3 \
  --os-project-domain-name Default --os-user-domain-name Default \
  --os-project-name service --os-username placement token issue

The request you have made requires authentication. (HTTP 401) (Request-ID: req-06de995a-3024-46ec-baaf-7a6b283f4e5d)

Check openstack compute services:

# openstack compute service list
+--------------------------------------+----------------+-------------+----------+---------+-------+----------------------------+
| ID                                   | Binary         | Host        | Zone     | Status  | State | Updated At                 |
+--------------------------------------+----------------+-------------+----------+---------+-------+----------------------------+
| 53418943-44fc-4287-a3bf-3a2eaa134645 | nova-conductor | openstack-1 | internal | enabled | down  | 2023-09-26T19:48:22.000000 |
+--------------------------------------+----------------+-------------+----------+---------+-------+----------------------------+

This is missing three services, and the one service listed is down! Turns out, I just forgot to grant the placement user proper permissions.

$ openstack role add --project service --user placement admin

Now after a short wait...

$ openstack compute service list
+--------------------------------------+----------------+-------------+----------+---------+-------+----------------------------+
| ID                                   | Binary         | Host        | Zone     | Status  | State | Updated At                 |
+--------------------------------------+----------------+-------------+----------+---------+-------+----------------------------+
| 53418943-44fc-4287-a3bf-3a2eaa134645 | nova-conductor | openstack-1 | internal | enabled | up    | 2023-09-26T20:03:01.000000 |
| aaf68b24-a852-40f7-b784-49d6dca7001b | nova-compute   | openstack-2 | nova     | enabled | up    | 2023-09-26T20:03:08.000000 |
| 67dde661-068b-4908-bf16-9b179d1583d7 | nova-compute   | openstack-1 | nova     | enabled | up    | 2023-09-26T20:03:08.000000 |
| 9e3c0041-d01b-4e64-9da5-33f92e29a343 | nova-scheduler | openstack-1 | internal | enabled | up    | 2023-09-26T20:03:06.000000 |
+--------------------------------------+----------------+-------------+----------+---------+-------+----------------------------+

I re-ran node discovery:

$ su -s /bin/sh -c "nova-manage cell_v2 discover_hosts --verbose" nova
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': e2c723e5-a876-4fdc-9826-96839f24d57b
Checking host mapping for compute host 'openstack-2': bef1a16e-48fd-4381-ab0a-1331d18bd232
Creating host mapping for compute host 'openstack-2': bef1a16e-48fd-4381-ab0a-1331d18bd232
Checking host mapping for compute host 'openstack-1': 72ada24f-8bd9-489f-a5a5-c010cb884313
Creating host mapping for compute host 'openstack-1': 72ada24f-8bd9-489f-a5a5-c010cb884313
Found 2 unmapped computes in cell: e2c723e5-a876-4fdc-9826-96839f24d57b

And verified the placement service is now working:

$ openstack --os-placement-api-version 1.2 resource class list --sort-column name
+----------------------------------------+
| name                                   |
+----------------------------------------+
| DISK_GB                                |
| FPGA                                   |
| IPV4_ADDRESS                           |
...
+----------------------------------------+

And I can confirm the compute node is available now:

$ openstack compute service list --service nova-compute
+--------------------------------------+--------------+-------------+------+---------+-------+----------------------------+
| ID                                   | Binary       | Host        | Zone | Status  | State | Updated At                 |
+--------------------------------------+--------------+-------------+------+---------+-------+----------------------------+
| aaf68b24-a852-40f7-b784-49d6dca7001b | nova-compute | openstack-2 | nova | enabled | up    | 2023-09-26T20:11:18.000000 |
| 67dde661-068b-4908-bf16-9b179d1583d7 | nova-compute | openstack-1 | nova | enabled | up    | 2023-09-26T20:11:18.000000 |
+--------------------------------------+--------------+-------------+------+---------+-------+----------------------------+

Good, I can finally move on!

Networking Service - Neutron

Controller Node

Following the docs. Create the database and Keystone credentials:

$ sudo su
$ mysql
MariaDB [(none)]> CREATE DATABASE neutron;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'localhost' \
    ->   IDENTIFIED BY 'NEUTRON_DBPASS';
Query OK, 0 rows affected (0.031 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'%' \
    ->   IDENTIFIED BY 'NEUTRON_DBPASS';
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> exit
Bye
$ exit
$ cd ~ && . admin-openrc
$ openstack user create --domain default --password-prompt neutron
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field               | Value                            |
+---------------------+----------------------------------+
| domain_id           | default                          |
| enabled             | True                             |
| id                  | 61b935a00fb042c0b54bc214a71f75a3 |
| name                | neutron                          |
| options             | {}                               |
| password_expires_at | None                             |
+---------------------+----------------------------------+

$ openstack role add --project service --user neutron admin
$ openstack service create --name neutron \
  --description "OpenStack Networking" network
+-------------+----------------------------------+
| Field       | Value                            |
+-------------+----------------------------------+
| description | OpenStack Networking             |
| enabled     | True                             |
| id          | a337d49aa69c4cb49261a6bd74d45a77 |
| name        | neutron                          |
| type        | network                          |
+-------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
  network public http://controller:9696
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 5a6d8ded53b94180b405704a6a869762 |
| interface    | public                           |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | a337d49aa69c4cb49261a6bd74d45a77 |
| service_name | neutron                          |
| service_type | network                          |
| url          | http://controller:9696           |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
  network internal http://controller:9696
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | c1b62db61dc645b8be1da960f84d4a44 |
| interface    | internal                         |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | a337d49aa69c4cb49261a6bd74d45a77 |
| service_name | neutron                          |
| service_type | network                          |
| url          | http://controller:9696           |
+--------------+----------------------------------+

$ openstack endpoint create --region RegionOne \
  network admin http://controller:9696
+--------------+----------------------------------+
| Field        | Value                            |
+--------------+----------------------------------+
| enabled      | True                             |
| id           | 9b7bbc0122e04658889b845b62b467eb |
| interface    | admin                            |
| region       | RegionOne                        |
| region_id    | RegionOne                        |
| service_id   | a337d49aa69c4cb49261a6bd74d45a77 |
| service_name | neutron                          |
| service_type | network                          |
| url          | http://controller:9696           |
+--------------+----------------------------------+

Now I continue with the docs for option 2, self-service networks.

$ apt install neutron-server neutron-plugin-ml2 \
  neutron-linuxbridge-agent neutron-l3-agent neutron-dhcp-agent \
  neutron-metadata-agent

Configure Neutron:

$ vi /etc/neutron/neutron.conf
# Make the following edits...
[database]
# ...
connection = mysql+pymysql://neutron:NEUTRON_DBPASS@controller/neutron
[DEFAULT]
# ...
core_plugin = ml2
service_plugins = router
allow_overlapping_ips = true
transport_url = rabbit://openstack:RABBIT_PASS@controller
auth_strategy = keystone
notify_nova_on_port_status_changes = true
notify_nova_on_port_data_changes = true

[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = NEUTRON_PASS

[nova]
# ...
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = nova
password = NOVA_PASS

[oslo_concurrency]
# ...
lock_path = /var/lib/neutron/tmp

Configure the ML2 plug-in:

$ vi /etc/neutron/plugins/ml2/ml2_conf.ini
# Make the following changes
[ml2]
# ...
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
mechanism_drivers = linuxbridge,l2population
extension_drivers = port_security

[ml2_type_flat]
# ...
flat_networks = provider

[ml2_type_vxlan]
# ...
vni_ranges = 1:1000

[securitygroup]
# ...
enable_ipset = true

Configure the linux bridge agent:

$ vi /etc/neutron/plugins/ml2/linuxbridge_agent.ini
# Make the following changes
[linux_bridge]
physical_interface_mappings = provider:PROVIDER_INTERFACE_NAME

[vxlan]
enable_vxlan = true
local_ip = OVERLAY_INTERFACE_IP_ADDRESS
l2_population = true

[securitygroup]
# ...
enable_security_group = true
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

Ensure bridge traffic passes through iptables for IPv4 and IPv6. The following sysctl values should be set to 1:

$ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
$ sysctl net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-ip6tables = 1
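
If those sysctls are missing entirely, the br_netfilter kernel module likely isn't loaded; loading it (and persisting it across reboots) looks like:

$ sudo modprobe br_netfilter
$ echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf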

Configure the Layer 3 agent:

$ vi /etc/neutron/l3_agent.ini
# Change...
[DEFAULT]
# ...
interface_driver = linuxbridge

Configure the DHCP agent:

$ vi /etc/neutron/dhcp_agent.ini
# Change..
[DEFAULT]
# ...
interface_driver = linuxbridge
dhcp_driver = neutron.agent.linux.dhcp.Dnsmasq
enable_isolated_metadata = true

Configure the metadata agent:

$ vi /etc/neutron/metadata_agent.ini
# Edit
[DEFAULT]
# ...
nova_metadata_host = controller
metadata_proxy_shared_secret = METADATA_SECRET

Configure nova to use neutron:

$ vi /etc/nova/nova.conf 
# Edit..
[neutron]
# ...
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = NEUTRON_PASS
service_metadata_proxy = true
metadata_proxy_shared_secret = METADATA_SECRET

Finalize:

# Populate DB - this performed a lot of migrations for me and took a few minutes
$ su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf \
  --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron
...
INFO  [alembic.runtime.migration] Running upgrade 8160f7a9cebb -> cd9ef14ccf87
INFO  [alembic.runtime.migration] Running upgrade cd9ef14ccf87 -> 34cf8b009713
INFO  [alembic.runtime.migration] Running upgrade 7d9d8eeec6ad -> a8b517cff8ab
INFO  [alembic.runtime.migration] Running upgrade a8b517cff8ab -> 3b935b28e7a0
INFO  [alembic.runtime.migration] Running upgrade 3b935b28e7a0 -> b12a3ef66e62
INFO  [alembic.runtime.migration] Running upgrade b12a3ef66e62 -> 97c25b0d2353
INFO  [alembic.runtime.migration] Running upgrade 97c25b0d2353 -> 2e0d7a8a1586
INFO  [alembic.runtime.migration] Running upgrade 2e0d7a8a1586 -> 5c85685d616d
  OK
# Restart Nova
$ service nova-api restart
# Restart networking services
$ service neutron-server restart
$ service neutron-linuxbridge-agent restart
$ service neutron-dhcp-agent restart
$ service neutron-metadata-agent restart
$ service neutron-l3-agent restart
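
At this point it's worth sanity-checking that the agents actually registered. Assuming admin credentials are sourced in the shell:

$ openstack network agent list
# The Linux bridge, L3, DHCP and metadata agents should all report Alive = :-)
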
Compute Node

Install components:

$ apt install neutron-linuxbridge-agent

Configure:

$ vi /etc/neutron/neutron.conf
# Comment out any [database] settings, compute nodes do not connect directly to DB
# then...
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller
auth_strategy = keystone

[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = neutron
password = NEUTRON_PASS

[oslo_concurrency]
# ...
lock_path = /var/lib/neutron/tmp

Then we follow the docs for self-service networks.

$ vi /etc/neutron/plugins/ml2/linuxbridge_agent.ini
# Change
[linux_bridge]
physical_interface_mappings = provider:PROVIDER_INTERFACE_NAME
[vxlan]
enable_vxlan = true
local_ip = OVERLAY_INTERFACE_IP_ADDRESS
l2_population = true
[securitygroup]
# ...
enable_security_group = true
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

Reconfigure nova:

$ vi /etc/nova/nova.conf
# Change
[neutron]
# ...
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = NEUTRON_PASS

Finalize:

$ service nova-compute restart
$ service neutron-linuxbridge-agent restart
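
From the controller, the compute node's bridge agent should now show up too (compute1 is a placeholder for your compute hostname):

$ openstack network agent list --host compute1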

Dashboard Service - Horizon

Following the docs. On the controller node:

$ apt install openstack-dashboard

Edit local_settings.py:

$ vi /etc/openstack-dashboard/local_settings.py
OPENSTACK_HOST = "controller"
...
ALLOWED_HOSTS = ['*']
...
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'

CACHES = {
    'default': {
         'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
         'LOCATION': 'controller:11211',
    }
}
...
OPENSTACK_KEYSTONE_URL = "http://%s/identity/v3" % OPENSTACK_HOST
...
OPENSTACK_API_VERSIONS = {
    "identity": 3,
    "image": 2,
    "volume": 3,
}
...
OPENSTACK_KEYSTONE_DEFAULT_DOMAIN = "Default"
...
OPENSTACK_KEYSTONE_DEFAULT_ROLE = "user"
...
OPENSTACK_NEUTRON_NETWORK = {
    ...
    'enable_router': False,
    'enable_quotas': False,
    'enable_ipv6': False,
    'enable_distributed_router': False,
    'enable_ha_router': False,
    'enable_fip_topology_check': False,
}

Add WSGIApplicationGroup %{GLOBAL} to /etc/apache2/conf-available/openstack-dashboard.conf if it is not already included.

Reload:

$ systemctl reload apache2.service
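
A quick smoke test before reaching for a browser (expect a redirect to the login page rather than a 404 or 500):

$ curl -I http://controller/horizon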

Loading the URL in the browser, http://controller/horizon: Alt text

Logging in fails though...

Debugging Authentication Failure

First I looked for logs for Horizon, which are defined here. I read through access and error, and found the following lines in error.log:

[Wed Sep 27 00:15:55.142631 2023] [authz_core:error] [pid 1439955:tid 139986120726272] [client 10.100.0.20:45748] AH01630: client denied by server configuration: /usr/bin/keystone-wsgi-public
[Wed Sep 27 00:15:55.144296 2023] [wsgi:error] [pid 1439942:tid 139986868782848] [remote 10.20.0.12:54020] INFO openstack_auth.forms Login failed for user "admin" using domain "Default", remote address 10.20.0.12.

It seems that there is a configuration error resulting in access being denied. I found this serverfault post which suggested you have to add the port to the keystone URL:

$ vi /etc/openstack-dashboard/local_settings.py
...
OPENSTACK_KEYSTONE_URL = "http://%s/identity/v3" % OPENSTACK_HOST
->
OPENSTACK_KEYSTONE_URL = "http://%s:5000/identity/v3" % OPENSTACK_HOST
# Reload
$ systemctl reload apache2.service
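
In hindsight, checking what the service catalog actually advertises for Keystone would have pointed at the right URL immediately:

$ openstack endpoint list --service identity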

This time when I loaded the page, the default domain was prefilled, so it seems to have connected! However, on logging in I got this error: Alt text

I found the following in /var/log/apache2/error.log

[Wed Sep 27 00:31:47.289638 2023] [wsgi:error] [pid 1440655:tid 139986851997440] [remote 10.20.0.12:60009] neutronclient.common.exceptions.ServiceUnavailable: The server is currently unavailable. Please try again at a later time.<br /><br />
[Wed Sep 27 00:31:47.289664 2023] [wsgi:error] [pid 1440655:tid 139986851997440] [remote 10.20.0.12:60009] The Keystone service is temporarily unavailable.
[Wed Sep 27 00:31:47.289679 2023] [wsgi:error] [pid 1440655:tid 139986851997440] [remote 10.20.0.12:60009]
[Wed Sep 27 00:31:47.289693 2023] [wsgi:error] [pid 1440655:tid 139986851997440] [remote 10.20.0.12:60009]
[Wed Sep 27 00:31:47.289707 2023] [wsgi:error] [pid 1440655:tid 139986851997440] [remote 10.20.0.12:60009] Neutron server returns request_ids: ['req-e4a6d63f-5b57-4822-b527-76d14f53ec75']
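
Since Horizon is blaming Neutron, a quick check is whether neutron-server answers at all on its default API port:

$ curl http://controller:9696
# A healthy neutron-server returns a small JSON document listing API versions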

I found in the Neutron docs that there is a sanity-check tool:

$ neutron-sanity-check
2023-09-27 00:37:06.136 1441539 INFO neutron.common.config [-] Logging enabled!
2023-09-27 00:37:06.137 1441539 INFO neutron.common.config [-] /usr/bin/neutron-sanity-check version 20.3.1
2023-09-27 00:37:06.412 1441539 ERROR ovsdbapp.backend.ovs_idl.idlutils [-] Unable to open stream to tcp:127.0.0.1:6640 to retrieve schema: Connection refused
2023-09-27 00:37:06.413 1441539 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'privsep-helper', '--privsep_context', 'neutron.privileged.ovs_vsctl_cmd', '--privsep_sock_path', '/tmp/tmp1jgu98x3/privsep.sock']
2023-09-27 00:37:07.318 1441539 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap
2023-09-27 00:37:07.169 1441548 INFO oslo.privsep.daemon [-] privsep daemon starting
2023-09-27 00:37:07.181 1441548 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
2023-09-27 00:37:07.187 1441548 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_NET_ADMIN|CAP_SYS_ADMIN/none
2023-09-27 00:37:07.188 1441548 INFO oslo.privsep.daemon [-] privsep daemon running as pid 1441548
2023-09-27 00:37:07.781 1441539 CRITICAL neutron [-] Unhandled error: FileNotFoundError: [Errno 2] No such file or directory
2023-09-27 00:37:07.781 1441539 ERROR neutron Traceback (most recent call last):
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/ovsdb/native/connection.py", line 97, in _get_ovsdb_helper
2023-09-27 00:37:07.781 1441539 ERROR neutron     return idlutils.get_schema_helper(connection, self.SCHEMA)
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 215, in get_schema_helper
2023-09-27 00:37:07.781 1441539 ERROR neutron     return create_schema_helper(fetch_schema_json(connection, schema_name))
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 204, in fetch_schema_json
2023-09-27 00:37:07.781 1441539 ERROR neutron     raise Exception("Could not retrieve schema from %s" % connection)
2023-09-27 00:37:07.781 1441539 ERROR neutron Exception: Could not retrieve schema from tcp:127.0.0.1:6640
2023-09-27 00:37:07.781 1441539 ERROR neutron
2023-09-27 00:37:07.781 1441539 ERROR neutron During handling of the above exception, another exception occurred:
2023-09-27 00:37:07.781 1441539 ERROR neutron
2023-09-27 00:37:07.781 1441539 ERROR neutron Traceback (most recent call last):
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/bin/neutron-sanity-check", line 10, in <module>
2023-09-27 00:37:07.781 1441539 ERROR neutron     sys.exit(main())
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/cmd/sanity_check.py", line 473, in main
2023-09-27 00:37:07.781 1441539 ERROR neutron     return 0 if all_tests_passed() else 1
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/cmd/sanity_check.py", line 459, in all_tests_passed
2023-09-27 00:37:07.781 1441539 ERROR neutron     return all(opt.callback() for opt in OPTS if cfg.CONF.get(opt.name))
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/cmd/sanity_check.py", line 459, in <genexpr>
2023-09-27 00:37:07.781 1441539 ERROR neutron     return all(opt.callback() for opt in OPTS if cfg.CONF.get(opt.name))
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/cmd/sanity_check.py", line 85, in check_ovs_patch
2023-09-27 00:37:07.781 1441539 ERROR neutron     result = checks.patch_supported()
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/cmd/sanity/checks.py", line 123, in patch_supported
2023-09-27 00:37:07.781 1441539 ERROR neutron     with ovs_lib.OVSBridge(name) as br:
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/common/ovs_lib.py", line 243, in __init__
2023-09-27 00:37:07.781 1441539 ERROR neutron     super(OVSBridge, self).__init__()
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/common/ovs_lib.py", line 134, in __init__
2023-09-27 00:37:07.781 1441539 ERROR neutron     self.ovsdb = impl_idl.api_factory()
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/ovsdb/impl_idl.py", line 37, in api_factory
2023-09-27 00:37:07.781 1441539 ERROR neutron     _idl_monitor = n_connection.OvsIdlMonitor()
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/ovsdb/native/connection.py", line 109, in __init__
2023-09-27 00:37:07.781 1441539 ERROR neutron     super(OvsIdlMonitor, self).__init__()
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/ovsdb/native/connection.py", line 84, in __init__
2023-09-27 00:37:07.781 1441539 ERROR neutron     helper = self._get_ovsdb_helper(self._ovsdb_connection)
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/neutron/agent/ovsdb/native/connection.py", line 99, in _get_ovsdb_helper
2023-09-27 00:37:07.781 1441539 ERROR neutron     helpers.enable_connection_uri(connection)
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/oslo_privsep/priv_context.py", line 271, in _wrap
2023-09-27 00:37:07.781 1441539 ERROR neutron     return self.channel.remote_call(name, args, kwargs,
2023-09-27 00:37:07.781 1441539 ERROR neutron   File "/usr/lib/python3/dist-packages/oslo_privsep/daemon.py", line 215, in remote_call
2023-09-27 00:37:07.781 1441539 ERROR neutron     raise exc_type(*result[2])
2023-09-27 00:37:07.781 1441539 ERROR neutron FileNotFoundError: [Errno 2] No such file or directory
2023-09-27 00:37:07.781 1441539 ERROR neutron
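
The OVS errors above are arguably noise: this deployment uses Linux bridge, not Open vSwitch, and the tool gates its checks on configuration options. Pointing it at the actual config files should limit it to the relevant drivers:

$ neutron-sanity-check --config-file /etc/neutron/neutron.conf \
    --config-file /etc/neutron/plugins/ml2/ml2_conf.ini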

I went back through and confirmed the Neutron networking option 2 configuration, and found a typo. After fixing that and restarting the network stack, I could load the dashboard: Alt text

Firewall Rules

As I went along, I updated this table with the firewall rules that need to be added for each service on my firewall.

| Service   | From | To         | Port  | Protocol |
|-----------|------|------------|-------|----------|
| memcached | Node | Controller | 11211 | ??       |
| rabbitmq  | Node | Controller | 5672  | ??       |
| VNC       | Node | Controller | 6080  | ??       |
| VNC       | LAN  | Node       | 6080  | ??       |
| Keystone  | Node | Controller | 5000  | HTTP     |
| Glance    | Node | Controller | 9292  | HTTP     |
| Placement | Node | Controller | 8778  | HTTP     |
| Compute   | Node | Controller | 8774  | HTTP     |
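
As a sketch of what this looks like in practice, here is how the controller-bound rules might translate to ufw on the controller itself (the 10.100.0.0/24 node network is an assumption; these ports all speak TCP by default):

$ ufw allow from 10.100.0.0/24 to any port 11211 proto tcp   # memcached
$ ufw allow from 10.100.0.0/24 to any port 5672 proto tcp    # rabbitmq
$ ufw allow from 10.100.0.0/24 to any port 6080 proto tcp    # VNC
$ ufw allow from 10.100.0.0/24 to any port 5000,9292,8778,8774 proto tcp  # HTTP APIs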

Exploring OpenStack Horizon

There are two sections under Project: Compute and Network. Compute appears to allow you to configure 'images' and 'instances', which I can only assume would provision VMs on the hardware. The only image available is the one we created while validating that cinder was working properly: Alt text

The Network section lets you interact with Neutron to create virtual networks across the hosts. Alt text

To see what creating an instance looks like, I uploaded Alpine 3.18.4: Alt text Alt text
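
The CLI equivalent, for the record (the image name and file are placeholders for whatever you downloaded):

$ openstack image create "alpine-3.18.4" \
    --file alpine-virt-3.18.4-x86_64.iso \
    --disk-format iso --container-format bare --public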

Then, to create an instance: Alt text Alt text Alt text

Looks like I need to define a 'Flavor' first. Alt text
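 
The install guide's m1.nano flavor is enough for a smoke test; via the CLI:

$ openstack flavor create --id 0 --vcpus 1 --ram 512 --disk 1 m1.nano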

Heading back to instance creation: Alt text

Looks like I need a network too! Alt text

Creating a network: Alt text Alt text Alt text
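
Or from the CLI, roughly following the self-service network example in the docs (the subnet range and DNS server are mine to pick):

$ openstack network create selfservice
$ openstack subnet create --network selfservice \
    --dns-nameserver 1.1.1.1 --gateway 172.16.1.1 \
    --subnet-range 172.16.1.0/24 selfservice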

Back to instance creation: Alt text Alt text Alt text Alt text Alt text Alt text Alt text Alt text Alt text Alt text

And it didn't work. This is great; I'm having fun. Alt text

Checking the nova-api logs, there're some errors around Keystone discovery:

2023-09-29 20:14:51.556 1441847 ERROR nova.api.openstack.wsgi   File "/usr/lib/python3/dist-packages/keystoneauth1/identity/generic/base.py", line 158, in _do_create_plugin
2023-09-29 20:14:51.556 1441847 ERROR nova.api.openstack.wsgi     raise exceptions.DiscoveryFailure(
2023-09-29 20:14:51.556 1441847 ERROR nova.api.openstack.wsgi keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Unable to establish connection to https://controller/identity: HTTPSConnectionPool(host='controller', port=443): Max retries exceeded with url: /identity (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f05b98a7340>: Failed to establish a new connection: [Errno 111] ECONNREFUSED'))
2023-09-29 20:14:51.556 1441847 ERROR nova.api.openstack.wsgi
2023-09-29 20:14:51.560 1441847 INFO nova.api.openstack.wsgi [req-763a0111-bc94-4617-8388-224184833786 6b736944b2e246ccad34c99e07dd43e3 45f64608eb1e4d5181615752ba362134 - default default] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'keystoneauth1.exceptions.discovery.DiscoveryFailure'>
2023-09-29 20:14:51.562 1441847 INFO nova.osapi_compute.wsgi.server [req-763a0111-bc94-4617-8388-224184833786 6b736944b2e246ccad34c99e07dd43e3 45f64608eb1e4d5181615752ba362134 - default default] 10.100.0.20 "POST /v2.1/servers HTTP/1.1" status: 500 len: 660 time: 0.4092019
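
The traceback shows something still resolving Keystone at https://controller/identity - HTTPS, no port - which smells like a stale auth_url. A quick way to hunt down the offending setting (paths assumed):

$ grep -rn "controller/identity" /etc/nova /etc/neutron /etc/placement /etc/openstack-dashboard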

Apache error.log:

[Fri Sep 29 19:56:03.402544 2023] [wsgi:error] [pid 1575322:tid 139986902353664] [remote 10.20.0.12:51436] /usr/lib/python3/dist-packages/oslo_policy/policy.py:1119: UserWarning: Policy "os_compute_api:servers:create": "rule:project_member_api" failed scope check. The token used to make the request was domain scoped but the policy requires ['project'] scope. This behavior may change in the future where using the intended scope is required
[Fri Sep 29 19:56:03.402604 2023] [wsgi:error] [pid 1575322:tid 139986902353664] [remote 10.20.0.12:51436]   warnings.warn(msg)
[Fri Sep 29 19:56:06.594928 2023] [wsgi:error] [pid 1575322:tid 139986902353664] [remote 10.20.0.12:51446] WARNING openstack_dashboard.api.glance OPENSTACK_IMAGE_BACKEND has a format "" unsupported by glance
[Fri Sep 29 19:56:06.628638 2023] [wsgi:error] [pid 1575322:tid 139986902353664] [remote 10.20.0.12:51446] WARNING openstack_dashboard.api.glance OPENSTACK_IMAGE_BACKEND has a format "docker" unsupported by glance
[Fri Sep 29 19:56:06.628879 2023] [wsgi:error] [pid 1575322:tid 139986902353664] [remote 10.20.0.12:51446] WARNING openstack_dashboard.api.glance OPENSTACK_IMAGE_BACKEND has a format "ova" unsupported by glance
[Fri Sep 29 19:56:06.629092 2023] [wsgi:error] [pid 1575322:tid 139986902353664] [remote 10.20.0.12:51446] WARNING openstack_dashboard.api.glance OPENSTACK_IMAGE_BACKEND has a format "ploop" unsupported by glance
[Fri Sep 29 20:01:21.634129 2023] [wsgi:error] [pid 1575323:tid 139986860390144] [remote 10.20.0.12:51874] ERROR django.request Internal Server Error: /horizon/api/nova/servers/
[Fri Sep 29 20:12:28.202142 2023] [wsgi:error] [pid 1575324:tid 139986843604736] [remote 10.20.0.12:61014] ERROR django.request Internal Server Error: /horizon/api/nova/servers/
[Fri Sep 29 20:14:51.565530 2023] [wsgi:error] [pid 1575322:tid 139986877175552] [remote 10.20.0.12:52446] ERROR django.request Internal Server Error: /horizon/api/nova/servers/

At this point, I had spent far too long on this and... gave up.

Conclusion

OpenStack has all the features one would need to sell infrastructure like a public cloud provider would - this comes with a lengthy setup and a lot of possible customization.

The reality here for me is that this stack does not suit my homelab use-case. Even though it can provide similar functionality to Portainer or Proxmox, it's going to increase my maintenance workload significantly. If I want to leverage cloud infrastructure in my environment, I can easily deploy a Proxmox node in a cloud VPC or create a Docker Swarm node somewhere rather than go through the laborious process of setting up an OpenStack node.

I do wish I had spent more time reading the OpenStack repositories to see how it is implemented, so perhaps one day I will come back, finish the setup, and get into the code.

References