Proxmox 6 to 7 to 8 Upgrade
While I've been away, my Proxmox instance has fallen out of date. Since this is the base OS on all my hardware, I really need to perform this upgrade. I've been putting it off for far too long because of how labor-intensive the documentation makes it seem.
Backup
Before getting into this major upgrade, I really should take backups of my nodes, which shamefully I've never actually done before.
For Proxmox the best practice is to take two layers of backups:
- Proxmox Host configuration
- Guest VM Snapshots
Proxmox Host
The easiest way I found was a script on GitHub; you can store the result wherever you want. In my case I mounted a CIFS disk and set that as my target.
Here is the process I used:
- Log in to the machine to back up via SSH
- Download the script:
wget -qO- https://raw.githubusercontent.com/DerDanilo/proxmox-stuff/master/prox_config_backup.sh > prox_config_backup.sh
- Edit line 16, DEFAULT_BACK_DIR, to be the storage location in your Proxmox cluster you want to write the backup file to. If you want to add another location, do it from the Proxmox UI: Datacenter -> Storage.
- Make the script executable:
chmod +x prox_config_backup.sh
- Execute the script! It is very verbose; I suggest you read the output.
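For reference, Proxmox mounts storage added through the UI under /mnt/pve/<storage-name>, so with my 'external-backup' CIFS share the edited line looked something like this (your storage name will differ):
DEFAULT_BACK_DIR="/mnt/pve/external-backup"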
Guest Snapshots
This can and should be completed through the Proxmox UI. Just be sure that you are moving the snapshots to a remote disk.
- Click each VM you need to back up and take a snapshot! Again, store this somewhere external.
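If you'd rather script this than click through the UI, vzdump does the same job from the shell. A minimal sketch, assuming a guest with ID 100 and my 'external-backup' storage as the target:
# back up guest 100 to the external storage; snapshot mode keeps the
# guest running, and zstd keeps the archive small
vzdump 100 --storage external-backup --mode snapshot --compress zstd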
6.x to 7
I have two hosts to update; we'll call them Primary and Secondary. The Secondary obtains its internet gateway through the Primary, so I have to add a direct ethernet connection for performing the upgrade instead.
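Swapping the Secondary's default route over to the temporary link is a one-liner; the gateway address and interface name below are placeholders, not my real config:
# point the default route at the directly connected uplink
ip route replace default via 192.168.1.1 dev eno2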
Proxmox has a guide for performing this procedure, which I'll reiterate in more concise steps. My process will be:
- Connect Serial Console and Ethernet Cable for direct internet connection
- Update to latest 6.x
apt update && apt upgrade && apt dist-upgrade
- Execute preflight check
pve6to7 --full
- Confirm MAC addresses of adapters are hardcoded in /etc/network/interfaces (see the example after this list)
- Update all debian repos to Bullseye
sed -i 's/buster\/updates/bullseye-security/g;s/buster/bullseye/g' /etc/apt/sources.list
- Update the no-subscription repos from 6 to 7:
sed -i -e 's/buster/bullseye/g' /etc/apt/sources.list.d/pve-install-repo.list
- View /etc/apt/sources.list.d/pve-enterprise.list and /etc/apt/sources.list to confirm the repositories match what's expected (see the repositories documentation)
- Upgrade:
apt update && apt dist-upgrade
- Move the serial console and cable to the next host and repeat steps 2-8
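Hardcoding a bridge MAC just means adding a hwaddress line to the bridge stanza in /etc/network/interfaces. A sketch with placeholder values (the addresses and interface names are not from my real config):

auto vmbr0
iface vmbr0 inet static
    address 192.0.2.10/24
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    # pin the bridge MAC so it doesn't change across the upgrade
    hwaddress aa:bb:cc:dd:ee:ff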
7.x to 8
Again, Proxmox provides a guide, which I will reiterate as a concise process list:
- Connect serial console and ethernet cable
- Run preflight checks
pve7to8 --full
- Confirm pve version 7.4-15 or newer with pveversion
- Update to 'Bookworm'
sed -i 's/bullseye/bookworm/g' /etc/apt/sources.list
- Confirm no Bullseye repos remain in /etc/apt/sources.list.d/pve-enterprise.list and /etc/apt/sources.list (see the grep after this list)
- Replace bullseye with bookworm for pve repos:
sed -i -e 's/bullseye/bookworm/g' /etc/apt/sources.list.d/pve-install-repo.list
- Update and upgrade
apt update && apt dist-upgrade
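A quick way to double-check that no repo entry was missed (just a grep, not from the Proxmox docs):
# any hit here is a repo entry still pointing at the old release
grep -rn bullseye /etc/apt/sources.list /etc/apt/sources.list.d/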
6 to 7 Upgrade Log
I ran initial updates and scripts over SSH instead:
Primary Node:
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =
Checking for package updates..
PASS: all packages uptodate
Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 6.4-1
Checking running kernel version..
PASS: expected running kernel '5.4.203-1-pve'.
= CHECKING CLUSTER HEALTH/SETTINGS =
PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.
Analzying quorum settings and state..
INFO: configured votes - nodes: 2
INFO: configured votes - qdevice: 0
INFO: current expected votes: 2
INFO: current total votes: 2
WARN: cluster consists of less than three quorum-providing nodes!
Checking nodelist entries..
PASS: nodelist settings OK
Checking totem settings..
PASS: totem settings OK
INFO: run 'pvecm status' to get detailed cluster status..
= CHECKING HYPER-CONVERGED CEPH STATUS =
SKIP: no hyper-converged ceph setup detected!
= CHECKING CONFIGURED STORAGES =
WARN: storage 'external-backup' enabled but not active!
PASS: storage 'local' enabled and active.
PASS: storage 'local-lvm' enabled and active.
= MISCELLANEOUS CHECKS =
INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
WARN: 1 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'salmonsec' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '10.100.0.11' configured and active on single interface.
INFO: Checking backup retention settings..
INFO: storage 'local' - no backup retention settings defined - by default, PVE 7.x will no longer keep only the last backup, but all backups
PASS: no problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking custom roles for pool permissions..
INFO: Checking node and guest description/note legnth..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking storage content type configuration..
PASS: no problems found
INFO: Checking if the suite for the Debian security repository is correct..
INFO: Make sure to change the suite of the Debian security repository from 'buster/updates' to 'bullseye-security' - in /etc/apt/sources.list:10
SKIP: No containers on node detected.
= SUMMARY =
TOTAL: 25
PASSED: 20
SKIPPED: 2
WARNINGS: 3
FAILURES: 0
ATTENTION: Please check the output for detailed information!
Secondary Node:
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =
Checking for package updates..
PASS: all packages uptodate
Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 6.4-1
Checking running kernel version..
PASS: expected running kernel '5.4.203-1-pve'.
= CHECKING CLUSTER HEALTH/SETTINGS =
PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.
Analzying quorum settings and state..
INFO: configured votes - nodes: 2
INFO: configured votes - qdevice: 0
INFO: current expected votes: 2
INFO: current total votes: 2
WARN: cluster consists of less than three quorum-providing nodes!
Checking nodelist entries..
PASS: nodelist settings OK
Checking totem settings..
PASS: totem settings OK
INFO: run 'pvecm status' to get detailed cluster status..
= CHECKING HYPER-CONVERGED CEPH STATUS =
SKIP: no hyper-converged ceph setup detected!
= CHECKING CONFIGURED STORAGES =
WARN: storage 'external-backup' enabled but not active!
PASS: storage 'local' enabled and active.
PASS: storage 'local-lvm' enabled and active.
= MISCELLANEOUS CHECKS =
INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
WARN: 1 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'pveworker0' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '10.100.0.12' configured and active on single interface.
INFO: Checking backup retention settings..
INFO: storage 'local' - no backup retention settings defined - by default, PVE 7.x will no longer keep only the last backup, but all backups
PASS: no problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking custom roles for pool permissions..
INFO: Checking node and guest description/note legnth..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking storage content type configuration..
PASS: no problems found
INFO: Checking if the suite for the Debian security repository is correct..
INFO: Make sure to change the suite of the Debian security repository from 'buster/updates' to 'bullseye-security' - in /etc/apt/sources.list:6
SKIP: No containers on node detected.
= SUMMARY =
TOTAL: 25
PASSED: 20
SKIPPED: 2
WARNINGS: 3
FAILURES: 0
ATTENTION: Please check the output for detailed information!
Warnings:
- WARN: storage 'external-backup' enabled but not active!
- WARN: 1 running guest(s) detected - consider migrating or stopping them.
- WARN: cluster consists of less than three quorum-providing nodes!
These are all acceptable:
- 'external-backup' is a Samba drive that is offline, this is fine
- I'll stop all guests before I do the update procedure, once I connect via serial console
- I only have two nodes; sad, but that's all I've got!
My network interfaces do not have hardcoded MAC addresses on either node. What exactly does the Proxmox wiki say about this?
With Proxmox VE 7, the MAC address of the Linux bridge itself may change, as noted in Upgrade from 6.x to 7.0#Linux Bridge MAC-Address Change.
In hosted setups, the MAC address of a host is often restricted, to avoid spoofing by other hosts.
Each of my subnets has a bridge, so I certainly don't want them to get messed up; however, the MAC address is not restricted in my environment, so perhaps I do not need to worry? Solution A is to use ifupdown2, and I'm not sure if I'm already using that.
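dpkg can answer that quickly:
# shows the package state and installed version, or an error if absent
dpkg -s ifupdown2 | grep -E 'Status|Version'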
Both hosts show it is installed and above the minimum version declared in the docs, so I'm just going to not worry about this and hope for the best!
Finally, I updated the repos and then logged out of the SSH session.
I stopped all virtual machines on the cluster, then plugged into the serial console and ran apt update && apt dist-upgrade.
I hit the issue:
Upgrade wants to remove package 'proxmox-ve'
To resolve it I followed the wiki suggestion and ran apt remove linux-image-amd64, but this package was not installed, so it changed nothing.
Next, I installed the kernel helper and rebooted:
apt install pve-kernel-helper && reboot now
This didn't help.
Then I tried a suggestion from here. I used to have a Ceph cluster but stopped using it; apparently you need to add those repos back to get the upgrade to work:
echo "deb http://download.proxmox.com/debian/ceph-octopus bullseye main" > /etc/apt/sources.list.d/ceph.list
apt update
apt dist-upgrade -y
It worked!
I performed the same steps on the secondary node; however, that one got stuck at 99% with a repeated error message:
proc: Bad value for 'hidepid'
Apparently this is harmless, so I just continued to wait... I know the disks on this machine are failing and extremely slow. Turns out I just had to hit Enter: there was a prompt buried under all the messages from proc.
I fixed my routes and rebooted both nodes to get the cluster back to a healthy state. One of my VMs wouldn't start on the secondary node, failing with the error:
TASK ERROR: activating LV 'pve/data' failed: Activation of logical volume pve/data is prohibited while logical volume pve/data_tmeta is active.
The following commands should resolve the issue:
lvchange -an pve/data_tdata
lvchange -an pve/data_tmeta
lvchange -ay pve/data
In my case these actually failed, with errors saying my PV metadata was corrupted... thankfully a restart resolved the issue.
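For anyone hitting the same thing, you can inspect the activation state of the thin pool volumes before and after trying the lvchange dance (plain lvs, nothing Proxmox-specific; 'pve' is the default volume group name):
# the 5th character of the attr column is 'a' when the LV is active
lvs -a -o lv_name,lv_attr pve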
7 to 8 Upgrade Log
Primary Node:
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =
Checking for package updates..
PASS: all packages up-to-date
Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 7.4-1
Checking running kernel version..
PASS: running kernel '5.15.116-1-pve' is considered suitable for upgrade.
= CHECKING CLUSTER HEALTH/SETTINGS =
PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.
Analzying quorum settings and state..
INFO: configured votes - nodes: 2
INFO: configured votes - qdevice: 0
INFO: current expected votes: 2
INFO: current total votes: 2
WARN: cluster consists of less than three quorum-providing nodes!
Checking nodelist entries..
PASS: nodelist settings OK
Checking totem settings..
PASS: totem settings OK
INFO: run 'pvecm status' to get detailed cluster status..
= CHECKING HYPER-CONVERGED CEPH STATUS =
SKIP: no hyper-converged ceph setup detected!
= CHECKING CONFIGURED STORAGES =
WARN: storage 'external-backup' enabled but not active!
PASS: storage 'local' enabled and active.
PASS: storage 'local-lvm' enabled and active.
INFO: Checking storage content type configuration..
PASS: no storage content problems found
WARN: activating 'external-backup' failed - storage 'external-backup' is not online
PASS: no storage re-uses a directory for multiple content types.
= MISCELLANEOUS CHECKS =
INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvescheduler.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for supported & active NTP service..
WARN: systemd-timesyncd is not the best choice for time-keeping on servers, due to only applying updates on boot.
While not necessary for the upgrade it's recommended to use one of:
* chrony (Default in new Proxmox VE installations)
* ntpsec
* openntpd
INFO: Checking for running guests..
WARN: 1 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'salmonsec' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '10.100.0.11' configured and active on single interface.
INFO: Check node certificate's RSA key size
PASS: Certificate 'pve-root-ca.pem' passed Debian Busters (and newer) security level for TLS connections (4096 >= 2048)
PASS: Certificate 'pve-ssl.pem' passed Debian Busters (and newer) security level for TLS connections (2048 >= 2048)
INFO: Checking backup retention settings..
PASS: no backup retention problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking permission system changes..
INFO: Checking custom role IDs for clashes with new 'PVE' namespace..
PASS: no custom roles defined, so no clash with 'PVE' role ID namespace enforced in Proxmox VE 8
INFO: Checking if LXCFS is running with FUSE3 library, if already upgraded..
SKIP: not yet upgraded, no need to check the FUSE library version LXCFS uses
INFO: Checking node and guest description/note length..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking if the suite for the Debian security repository is correct..
PASS: found no suite mismatch
INFO: Checking for existence of NVIDIA vGPU Manager..
PASS: No NVIDIA vGPU Service found.
INFO: Checking bootloader configuration...
SKIP: not yet upgraded, no need to check the presence of systemd-boot
SKIP: No containers on node detected.
= SUMMARY =
TOTAL: 36
PASSED: 27
SKIPPED: 4
WARNINGS: 5
FAILURES: 0
ATTENTION: Please check the output for detailed information!
Secondary Node:
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =
Checking for package updates..
PASS: all packages up-to-date
Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 7.4-1
Checking running kernel version..
PASS: running kernel '5.15.116-1-pve' is considered suitable for upgrade.
= CHECKING CLUSTER HEALTH/SETTINGS =
PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.
Analzying quorum settings and state..
INFO: configured votes - nodes: 2
INFO: configured votes - qdevice: 0
INFO: current expected votes: 2
INFO: current total votes: 2
WARN: cluster consists of less than three quorum-providing nodes!
Checking nodelist entries..
PASS: nodelist settings OK
Checking totem settings..
PASS: totem settings OK
INFO: run 'pvecm status' to get detailed cluster status..
= CHECKING HYPER-CONVERGED CEPH STATUS =
SKIP: no hyper-converged ceph setup detected!
= CHECKING CONFIGURED STORAGES =
WARN: storage 'external-backup' enabled but not active!
PASS: storage 'local' enabled and active.
PASS: storage 'local-lvm' enabled and active.
INFO: Checking storage content type configuration..
PASS: no storage content problems found
WARN: activating 'external-backup' failed - storage 'external-backup' is not online
PASS: no storage re-uses a directory for multiple content types.
= MISCELLANEOUS CHECKS =
INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvescheduler.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for supported & active NTP service..
WARN: systemd-timesyncd is not the best choice for time-keeping on servers, due to only applying updates on boot.
While not necessary for the upgrade it's recommended to use one of:
* chrony (Default in new Proxmox VE installations)
* ntpsec
* openntpd
INFO: Checking for running guests..
WARN: 1 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'pveworker0' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '10.100.0.12' configured and active on single interface.
INFO: Check node certificate's RSA key size
PASS: Certificate 'pve-root-ca.pem' passed Debian Busters (and newer) security level for TLS connections (4096 >= 2048)
PASS: Certificate 'pve-ssl.pem' passed Debian Busters (and newer) security level for TLS connections (2048 >= 2048)
INFO: Checking backup retention settings..
PASS: no backup retention problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking permission system changes..
INFO: Checking custom role IDs for clashes with new 'PVE' namespace..
PASS: no custom roles defined, so no clash with 'PVE' role ID namespace enforced in Proxmox VE 8
INFO: Checking if LXCFS is running with FUSE3 library, if already upgraded..
SKIP: not yet upgraded, no need to check the FUSE library version LXCFS uses
INFO: Checking node and guest description/note length..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking if the suite for the Debian security repository is correct..
PASS: found no suite mismatch
INFO: Checking for existence of NVIDIA vGPU Manager..
PASS: No NVIDIA vGPU Service found.
INFO: Checking bootloader configuration...
SKIP: not yet upgraded, no need to check the presence of systemd-boot
SKIP: No containers on node detected.
= SUMMARY =
TOTAL: 36
PASSED: 27
SKIPPED: 4
WARNINGS: 5
FAILURES: 0
ATTENTION: Please check the output for detailed information!
Warning Summary:
- Cluster not big enough for HA pair
- external-backup not online
- Guests are running
- systemd-timesyncd is not the best choice for time-keeping on servers, due to only applying updates on boot
These are all acceptable except for the recommendation to change the time-keeping method. I have run into sync issues in the past on these nodes, so changing it is welcome.
To make this change, you simply install chrony, which automatically removes the previously used systemd-timesyncd.
On both nodes:
apt install chrony
I then re-ran pve7to8 to confirm that the warning was resolved.
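You can also ask chrony directly whether it is syncing (standard chronyc, not part of the Proxmox checks):
# reports the current time source, offset, and sync status
chronyc tracking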
Then:
- Confirm pve version is 7.4-15 or newer
- Update package repos
- This time there is no Bookworm Ceph repo needed (I no longer use Ceph), so I removed it entirely (see the command after this list)
- Connect via serial console, connect ethernet, and fix the default route
- Perform the upgrade
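Removing the Ceph repo just means deleting the file I created during the 6-to-7 upgrade (assuming it's still at the path I used earlier):
# drop the old ceph repo and refresh the package lists
rm /etc/apt/sources.list.d/ceph.list
apt update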
Once again, everything went more smoothly than I expected. The biggest time sink in this project was performing backups and debugging storage issues.