Introduction
Currently, I have a single Proxmox node with 1 VM and 8 LXC containers. These containers all run fairly simple individual services, while the singular VM just runs Docker for the few things that really are best done in Docker.
Inspired by Docker Compose, and more specifically how easy their compose files make embracing server as cattle, I wanted to try and glue together some automation for Proxmox.
My immediate goal would be to convert most (all?) of my containers into scripts that can dynamically create containers with known-good state (defined in config files and code). This would help with upgrades, bug squashing, speed to create new containers, and disaster recovery (maybe).
In the future I wouldn’t mind moving some of my docker containers into LXC containers and potentially even getting rid of that VM to save on resources, but some of them use pretty involved Docker Networking on third party images so I’m not holding my breath quite yet.
This is the game plan:
- Figure out how to create PVE containers using pct
- Figure out how to assign meta-container configuration (memory, CPU, mount points, privileged vs unprivileged)
- Figure out how to copy files from the host into the containers (config files, setup scripts)
- Figure out how to actually shut down an existing PVE container
  - Do I need to destroy the container to create a new one? Or can I just shut down and "archive" it somehow?
  - Does this do anything weird with volumes, logs, etc.? Any footguns before nuking a container I'm actually using?
- Clean up the script so that it can be easily extended for different containers
- Actually start to migrate some existing containers and their data into this system
- BONUS: Figure out if there is a way to add a button for this somewhere in the PVE UI (don't judge me, I like the GUI)
Whew.
Steps
1. pct Basics
pct provides an interface for creating containers natively, without having to use the GUI. Most of the steps below were gathered from the official PVE documentation.
First we want to create a new container. Start by finding your existing container template if you already have one; the only Debian 12 template I use lives on my zfs-dir storage. You can use pveam to find the name that Proxmox uses to refer to that template:
root@pve:~# pveam list zfs-dir
NAME SIZE
zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst 120.29MB
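As an aside, if you don't already have a template cached, pveam can download one for you. A rough sketch (assuming your storage accepts the vztmpl content type):
# Refresh the template index, see what's available, then fetch one
pveam update
pveam available --section system | grep debian-12
pveam download zfs-dir debian-12-standard_12.2-1_amd64.tar.zst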
Then you can use that information to create a basic container like so:
pct create 150 zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst
This fails:
root@pve:~# pct create 150 zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst
400 Parameter verification failed.
storage: storage 'local' does not support container directories
pct create <vmid> <ostemplate> [OPTIONS]
It fails because my default storage location (local) does not allow container storage volumes. Instead I use my ZFS array, zfs-main, for that purpose. We'll need to append that to the create command so the container ends up in the right place:
pct create 150 zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst --storage zfs-main
And now it works!
root@pve:~# pct create 150 zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst --storage zfs-main
extracting archive '/zfs/backups/template/cache/debian-12-standard_12.2-1_amd64.tar.zst'
Total bytes read: 519987200 (496MiB, 91MiB/s)
Detected container architecture: amd64
Creating SSH host key 'ssh_host_rsa_key' - this may take some time ...
done: SHA256:QSlHc6fDLKNztb3UalfntLvfTjIfazsKsK6ovS9U0aQ root@localhost
Creating SSH host key 'ssh_host_dsa_key' - this may take some time ...
done: SHA256:+3z7yUefkBegm4rjcyQnovIG7pzBsxYcxuyjYFOZ2zY root@localhost
Creating SSH host key 'ssh_host_ecdsa_key' - this may take some time ...
done: SHA256:Gca9h6kIHX71mihXzB0vi62MjewQn6mrfHrY+Y2wxX8 root@localhost
Creating SSH host key 'ssh_host_ed25519_key' - this may take some time ...
done: SHA256:t3o9jWNja5CCj+Gd8lVjbV4V/cQ/rIai1hpVy8VPiZs root@localhost
root@pve:~# pct list
VMID Status Lock Name
101 running tailscale
102 running ns1
103 running ns2
104 running ripe-probe
105 running wiki
106 running dashboard
107 running cacti
108 running cache
150 stopped CT150
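At this point the new container exists but hasn't been started. Just as a sanity check (not part of the script), you can boot it and hop inside:
pct start 150
pct enter 150   # root shell inside the container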
2. pct Additional Details
Now that we have a proof of concept, we need to start fine-tuning some of these parameters. Poking around the GUI at little CT 150, I'm noticing a lot of things we're missing: networking, storage, CPU, and so on. Before tackling those, let's wrap what we have so far into a script:
root@pve:~# cat /opt/pve-new-ct.sh
#!/bin/bash
# If did not supply LXC ID, die
[ -z "${1}" ] && printf "Usage: %s <CT_ID>\n" "$0" && exit 1
# Create new container
pct create "${1}" zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst --storage zfs-main
root@pve:~# chmod +x /opt/pve-new-ct.sh
root@pve:~# /opt/pve-new-ct.sh 150
CT 150 already exists on node 'pve'
This works but as you can see we’re blindly trying to create new
containers without checking if one is there already. Let’s add some
additional logic. We can check what IDs are in use via /etc/pve/.vmlist:
root@pve:~# cat /etc/pve/.vmlist
{
"version": 3255,
"ids": {
"104": { "node": "pve", "type": "lxc", "version": 3217 },
"201": { "node": "pve", "type": "qemu", "version": 3259 },
"103": { "node": "pve", "type": "lxc", "version": 3207 },
"102": { "node": "pve", "type": "lxc", "version": 3197 },
"108": { "node": "pve", "type": "lxc", "version": 3257 },
"101": { "node": "pve", "type": "lxc", "version": 3187 },
"107": { "node": "pve", "type": "lxc", "version": 3247 },
"106": { "node": "pve", "type": "lxc", "version": 3237 },
"105": { "node": "pve", "type": "lxc", "version": 3227 },
"150": { "node": "pve", "type": "lxc", "version": 3263 }}
}
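As a quick aside, if jq is installed on the host you can pull the used IDs straight out of that JSON (a sketch, not what the script ends up using):
# List every VMID currently known to the node
jq -r '.ids | keys[]' /etc/pve/.vmlist | sort -n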
And gather status using pct list:
root@pve:~# pct list
VMID Status Lock Name
101 running tailscale
102 running ns1
103 running ns2
104 running ripe-probe
105 running wiki
106 running dashboard
107 running cacti
108 running cache
150 stopped CT150
# Check if ID is in use
if grep -q \""${1}"\" /etc/pve/.vmlist; then
printf "ID %s already in use...\n" "$1"
STATUS_LIST=$(pct list | grep -E "^${1} " | tr -s ' ') # trailing space forces an exact VMID match
STATUS=$(echo "$STATUS_LIST" | cut -d' ' -f2)
NAME=$(echo "$STATUS_LIST" | cut -d' ' -f3)
while true; do
printf "Container %s (%s) is currently %s.\n" "$NAME" "$1" "$STATUS"
read -rp "Continue and destroy container? (y/n) " yn
case $yn in
[Yy]* ) break;;
[Nn]* ) exit;;
esac
done
else
printf "ID %s is not in use. Continue...\n" "$1"
EXISTING=0
fi
And it seems to work OK:
root@pve:~# /opt/pve-new-ct.sh 150
ID 150 already in use...
Container CT150 (150) is currently stopped.
Continue and destroy container? (y/n) y
root@pve:~# /opt/pve-new-ct.sh 151
ID 151 is not in use. Continue...
root@pve:~# /opt/pve-new-ct.sh 102
ID 102 already in use...
Container ns1 (102) is currently running.
Continue and destroy container? (y/n) n
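A simpler alternative I considered: pct status should exit non-zero for a VMID that doesn't exist, so the whole check could arguably collapse into something like this (a sketch, worth verifying on your own node before trusting it):
# Existence check based purely on pct's exit code
if pct status "${1}" &>/dev/null; then
    printf "ID %s already in use...\n" "$1"
else
    printf "ID %s is not in use. Continue...\n" "$1"
    EXISTING=0
fi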
Now that the script feels a little more controlled I’ll start adding
cleanup functionality too. In general this will only ever be used for a
full delete and re-build so I’m not going to bother trying to edit any
existing containers.
# If this ID had a previous container, destroy it first
# (EXISTING is only set above when the ID was *not* already in use)
if [ -z "$EXISTING" ]; then
printf "Destroying CT %s...\n" "$1"
pct destroy "$1" --purge || { echo "Failed to purge container, exiting..."; exit 1; }
fi
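One thing worth calling out: pct destroy will refuse to remove a container that is still running, so a shutdown step needs to come first (you can see it happening in the script output later on). Something along these lines works, give or take the timeout value (a sketch):
# Stop the container first if it is still running
if pct status "$1" | grep -q running; then
    printf "Shutting down CT %s...\n" "$1"
    pct shutdown "$1" --timeout 60 || pct stop "$1"
fi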
Finally, time to get into the bulk of the pct options. It seems like most of these can be set either at creation time or afterwards. I'll do as much as possible at creation, because that saves an extra step. Here is the big long pct create command I ended up with:
# Create new container
pct create "${1}" \
zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
--storage zfs-main \
--hostname "Testing" \
--description "Testing description" \
--onboot 1 \
--cores 1 \
--memory 256 \
--swap 256 \
--features nesting=1 \
--timezone host \
--unprivileged 1 \
--net0 name=eth0,bridge=vmbr0,firewall=1,ip=192.168.30.150/24,gw=192.168.30.1,ip6=fd00:30::150/64,gw6=fd00:30::1 \
--start 1 \
|| { echo "Failed to create container, exiting..."; exit 1; }
pct status "$1" --verbose
This script will need to be cleaned up so it can dynamically take in things like hostname, memory, IP addresses, etc., but for now it should be good enough.
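Anything you'd rather not bake into the create call can also be changed afterwards with pct set, which accepts mostly the same options. For example (a sketch):
# Bump resources and tweak the description after the fact
pct set 150 --cores 2 --memory 512 --swap 512
pct set 150 --description "Updated description"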
3. Moving Files into the Container
Now that we can dynamically create containers with specific options, we need to start the actual setup process. My base ZFS volume is mounted on the Proxmox host under /zfs, so I'm going to start adding things there. I'll use /zfs/lxc/<CT> as the root dir for files under each container. I'm also going to have some shared files for common setup, so those will go under /zfs/lxc/shared:
houck@pve:~$ tree /zfs/lxc/
/zfs/lxc/
├── shared
│ └── opt
│ └── setup.sh
└── testing
├── opt
│ └── hello-world.sh
└── root
└── hello-world.txt
6 directories, 3 files
We’ll modify our initial script to try copying all 3 of these files into
the container filesystem, then executing both startup scripts.
# Copy needed files into container filesystem
printf "Copying files into CT %s...\n" "$1"
pct push "$1" /zfs/lxc/shared/opt/setup.sh /opt/setup.sh
pct push "$1" /zfs/lxc/testing/opt/hello-world.sh /opt/hello-world.sh
pct push "$1" /zfs/lxc/testing/root/hello-world.txt /root/hello-world.txt
If we log in manually we can see that the files are there.
Now we can start running setup commands.
# Run setup commands
printf "Running setup in CT %s...\n" "$1"
pct exec "$1" -- bash -c 'chmod +x /opt/*.sh'
pct exec "$1" -- bash -c '/opt/setup.sh'
pct exec "$1" -- bash -c '/opt/hello-world.sh'
pct exec "$1" -- bash -c 'cat /root/hello-world.txt'
And this works as expected:
houck@pve:~$ sudo /opt/pve-new-ct.sh 150
ID 150 already in use...
Container Testing (150) is currently running.
Continue and destroy container? (y/n) y
Shutting down CT 150...
Destroying CT 150...
Could not execute systemctl: at /usr/bin/deb-systemd-invoke line 145.
Setting up ssh (1:9.2p1-2+deb12u3) ...
Processing triggers for man-db (2.11.2-2) ...
Processing triggers for libc-bin (2.36-9+deb12u3) ...
hello world
There is a lot more functionality I need to add to my setup script to create a fully fledged system, but I'll worry about that in step 5 once we are ready to start looking at "prod" deployments.
4. Gracefully Deleting the Previous Container
While messing with this I noticed that I was missing some nightly backups for my test LXC, 150. I only had a backup from the 14th, even though this container was first created mid-day on the 12th. That sent me digging into what the --purge part of pct destroy XXX --purge actually did…
root@pve:/home/houck# pct help destroy
USAGE: pct destroy <vmid> [OPTIONS]
Destroy the container (also delete all uses files).
...
--purge (default=0)
Remove container from all related configurations. For example,
backup jobs, replication jobs or HA. Related ACLs and Firewall
entries will *always* be removed.
That is, in fact, the whole point. I’ve modified the script to no longer use the purge flag and now it no longer yeets all of my mostly-valid backups when rebuilding a container.
While toying with this I was curious whether taking a snapshot right before deletion would make sense; in theory, if the new container goes awry, having the old state captured would make it easier to roll back. However, snapshots do not persist when a container is deleted and re-created. They aren't separate entities like full backups, but rather are tied to the container itself. Instead, I think it is wiser to take a manual full backup right before destroying the container.
This can be done via vzdump relatively simply:
# Options mostly copied from `/etc/pve/jobs.cfg`
printf "Creating backup of CT %s...\n" "$1"
vzdump "$1" \
--compress zstd \
--fleecing 0 \
--mailnotification always \
--mailto pve@aidanhouck.com \
--mode snapshot \
--node pve \
--notes-template "Pre-rebuild: {{guestname}}" \
--storage zfs-dir \
|| { echo "Failed to backup container, exiting..."; exit 1; }
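Having that archive around also means the rebuild is reversible: pct restore can bring the old container back under the same VMID if the new one turns out broken. Roughly like this (a sketch; the archive filename is illustrative, yours will include a timestamp):
# Overwrite the freshly built CT with the pre-rebuild backup
pct restore "$1" zfs-dir:backup/vzdump-lxc-150-<timestamp>.tar.zst \
    --storage zfs-main --force 1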
5. Final Primary Script Cleanup
Now that the big steps are sensible I want to finish cleaning up this script. I left a few hard-coded IPs, files, etc. in there that need to come out before moving a prod container.
I ended up modifying the script to take a directory name as parameter 2:
CONFIG_ABOUT="/zfs/lxc/${2}/about.sh"
CONFIG_FILES="/zfs/lxc/${2}/files.sh"
# If did not supply path to container info, die
[ ! -f "${CONFIG_ABOUT}" ] && printf "ERROR: /zfs/lxc/%s/about.sh cannot be read!\n" "$2" && exit 1
[ ! -f "${CONFIG_FILES}" ] && printf "ERROR: /zfs/lxc/%s/files.sh cannot be read!\n" "$2" && exit 1
# shellcheck source=/dev/null
source "$CONFIG_ABOUT" || { printf "Failed to source %s, exiting..." "$CONFIG_ABOUT"; exit 1; }
...
# Use files.sh to copy needed files into the container and execute them
# shellcheck source=/dev/null
source "$CONFIG_FILES" || { printf "Failed to source %s, exiting..." "$CONFIG_FILES"; exit 1; }
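For context, the variables defined in about.sh (shown below) feed the earlier pct create call, roughly like this (an abbreviated sketch, not the literal script):
pct create "${1}" \
    zfs-dir:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
    --storage zfs-main \
    --hostname "$CT_NAME" \
    --description "$CT_DESC" \
    --cores "$CT_CORES" \
    --memory "$CT_MEMORY" \
    --swap "$CT_SWAP" \
    --unprivileged "$CT_UNPRIV" \
    --net0 "name=eth0,bridge=${CT_BRIDGE},firewall=1,ip=${CT_IP4},gw=${CT_IP4_GW},ip6=${CT_IP6},gw6=${CT_IP6_GW}"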
These files will look something like this:
root@pve:~# cat /zfs/lxc/testing/about.sh
#!/bin/bash
CT_NAME="Testing"
CT_DESC="Testing description"
CT_CORES="1"
CT_MEMORY="256"
CT_SWAP="256"
CT_UNPRIV="1"
CT_BRIDGE="vmbr0"
CT_IP4="192.168.30.150/24"
CT_IP4_GW="192.168.30.1"
CT_IP6="fd00:30::150/64"
CT_IP6_GW="fd00:30::1"
root@pve:~# cat /zfs/lxc/testing/files.sh
#!/bin/bash
# Copy needed files into container filesystem
printf "Copying files into CT %s...\n" "$1"
pct push "$1" /zfs/lxc/shared/opt/setup.sh /opt/setup.sh
pct push "$1" /zfs/lxc/testing/opt/hello-world.sh /opt/hello-world.sh
pct push "$1" /zfs/lxc/testing/root/hello-world.txt /root/hello-world.txt
# Run setup commands
printf "Running setup in CT %s...\n" "$1"
pct exec "$1" -- bash -c 'chmod +x /opt/*.sh'
pct exec "$1" -- bash -c '/opt/setup.sh'
pct exec "$1" -- bash -c '/opt/hello-world.sh'
pct exec "$1" -- bash -c 'cat /root/hello-world.txt'
The shellcheck directives are needed to silence SC1090, but besides that there are no real issues. Now I just need to make new directories and copy template/script files over for each container.
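Since each files.sh is mostly a pile of single-file pct push calls, a small helper could push a whole directory tree instead. Just a sketch (push_tree is a hypothetical name; -mindepth 2 skips the top-level about.sh and files.sh):
# Push every file under the per-container dir into the CT, preserving paths
push_tree() {
    local ct="$1" src="$2"
    find "$src" -mindepth 2 -type f -print0 | while IFS= read -r -d '' f; do
        dest="/${f#"$src"/}"
        pct exec "$ct" -- mkdir -p "$(dirname "$dest")"
        pct push "$ct" "$f" "$dest"
    done
}
# e.g. push_tree "$1" "/zfs/lxc/${2}"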
6. Putting It All Together
Finally it's time to move a single container over and finish up the proof of concept. I keep pretty decent notes in my wiki when setting something up for the first time, but they mostly cover interactive configuration. Things like config files, databases, and changes I forgot about are all going to be tough to deal with.
I’m going to start with Tailscale.
First, scaffolding using the template dir:
cp -r /zfs/lxc/shared /zfs/lxc/tailscale
mv /zfs/lxc/tailscale/about-template.sh /zfs/lxc/tailscale/about.sh
mv /zfs/lxc/tailscale/files-template.sh /zfs/lxc/tailscale/files.sh
mv /zfs/lxc/tailscale/opt/setup.sh /zfs/lxc/tailscale/opt/tailscale-setup.sh
Then I tweaked about.sh and files.sh with reasonable values. I mostly just mirrored what my existing Tailscale container was running with. tailscale-setup.sh ended up looking like this:
# Install curl and Tailscale
apt-get install -y curl
curl -fsSL https://tailscale.com/install.sh | sh
# Make necessary changes for acting as an exit node
echo 'net.ipv4.ip_forward = 1' | tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | tee -a /etc/sysctl.d/99-tailscale.conf
sysctl -p /etc/sysctl.d/99-tailscale.conf
# Change default config to allow Tailscale to run
# in an unprivileged container
#cp /etc/default/tailscaled /etc/default/tailscaled.bak
#sed -iE 's/FLAGS.*$/FLAGS="--tun=userspace-networking --socks5-server=localhost:1055 --outbound-http-proxy-listen=localhost:1055"/' /etc/default/tailscaled
# Make sure the service is running
systemctl start tailscaled
# Bring up with authkey
tailscale up --authkey="$(cat /opt/tailscale-authkey)" --advertise-routes="192.168.0.0/16" --advertise-exit-node --advertise-tags=tag:container --statedir=/var/lib/tailscale
tailscale status
/opt/tailscale-authkey was generated manually from https://login.tailscale.com/admin/settings/keys and expires after 90 days. I might look into auto-refreshing it via the API, or potentially disabling expiration.
The new containers were not coming up as exit nodes because their routes had not been approved in the dashboard first.
This was a pretty easy fix using the beta auto approvers feature. I edited my ACL in the Tailscale dashboard with the following sections:
// Define the tags which can be applied to devices and by which users.
"tagOwners": {
"tag:container": ["autogroup:admin"],
},
"autoApprovers": {
"routes": {
"192.168.0.0/16": ["tag:container"],
},
"exitNode": ["tag:container"],
},
Now this all works, but after recreating the container the old one is not removed from the Tailscale admin GUI. There may be a way to auto-delete it intelligently, but in reality this should probably be handled the way the Docker container's setup does it, with a persistent state directory.
# Create world read-writable dir for Tailscale to use
mkdir -p /zfs/lxc/tailscale/persist
chmod 777 /zfs/lxc/tailscale/persist/
# Add to `tailscale-setup.sh`:
--statedir=/persist/tailscale --state=/persist/tailscale/tailscaled.state
# Needed because the default systemd service includes a
# '--state=/...' parameter, $FLAGS (from /etc/default/tailscaled)
# are placed AFTER the default parameter. Evidently Tailscale
# prefers the first duplicate flag defined, not the most recent...
mkdir /etc/systemd/system/tailscaled.service.d
echo '[Service]' > /etc/systemd/system/tailscaled.service.d/override.conf
echo 'ExecStart=' >> /etc/systemd/system/tailscaled.service.d/override.conf
echo 'ExecStart=/usr/sbin/tailscaled --socket=/run/tailscale/tailscaled.sock --port=${PORT} $FLAGS' >> /etc/systemd/system/tailscaled.service.d/override.conf
systemctl daemon-reload
systemctl restart tailscaled.service
# Add to `tailscale-files.sh`:
echo 'mp0: /zfs/lxc/tailscale/persist,mp=/persist' | tee -a /etc/pve/lxc/${1}.conf
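Equivalently, instead of appending the mount point to the config file by hand, pct set can add the same bind mount and validate the syntax for you (a sketch; the container likely needs a restart to pick it up):
# Bind-mount the host persist dir into the container at /persist
pct set "${1}" --mp0 /zfs/lxc/tailscale/persist,mp=/persist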
Now when I delete and recreate the container it properly re-registers as the same existing machine in the Tailscale dashboard. Testing looks good, so the last step is migrating the prod container.
I took a manual backup for good luck (PVE->CT101->Backup->Backup Now) then let it fly:
pct shutdown 101
rm -r /zfs/lxc/tailscale/persist/tailscale/*
vim /zfs/lxc/tailscale/about.sh
-> edit non-conflicting IP addresses to intended IP addresses
-> edit hostname to intended hostname
/opt/pve-new-ct.sh 101 tailscale
While this was in the middle of building I went ahead and deleted the old machine from Tailscale’s dashboard. This is easier than trying to migrate state when there is really nothing important to migrate.
After a few minutes it finished installation and popped up in the dashboard.
7. Extra: Hacking the PVE GUI
I like using the Proxmox GUI (I know, I know), so I figured it might be fun to try and hack on it to insert my own custom button for this script.
Looking online, some people have done similar things but not quite adding custom functionality: https://lunar.computer/news/customize-proxmox-60/
Before worrying about persistence I should try and get it working. Starting with the source code I was able to browse to these lines:
Ext.define('PVE.lxc.Config', {
...
var moreBtn = Ext.create('Proxmox.button.Button', {
text: gettext('More'),
menu: {
items: [
{
text: gettext('Clone'),
iconCls: 'fa fa-fw fa-clone',
hidden: !caps.vms['VM.Clone'],
handler: function() {
PVE.window.Clone.wrap(nodename, vmid, template, 'lxc');
},
},
{
...
This looks like where the “More” dropdown for LXC containers is defined, which seems like a perfect place to add my own button.
Scouring around my local install, it seems like all of these JS files get compiled into /usr/share/pve-manager/js/pvemanagerlib.js. Editing that file around line 34445 (your mileage may vary…) I found the exact same code as above.
Adding my own item element worked very easily:
{
text: gettext('Rebuild Container'),
disabled: template,
xtype: 'pveMenuItem',
iconCls: 'fa fa-terminal',
hidden: false,
confirmMsg: 'Are you sure?',
handler: function() {
Proxmox.Utils.API2Request({
url: base_url + '/template',
waitMsgTarget: me,
method: 'POST',
failure: function(response, opts) {
Ext.Msg.alert('Error', response.htmlStatus);
},
});
},
},
This adds a button, but you may notice the handler looks a bit off (as written it just POSTs to the existing /template endpoint). This is where I got stuck. Somehow, you want this button to run a script on the server (/opt/pve-new-ct.sh $CT_ID $CT_NAME). I'm guessing you'd need to find where the API is defined and add a new endpoint specifically for this action, possibly even registering it as a job that shows up with a proper status and everything.
That is a lot more work than I’m willing to entertain right now, so this is where I’ll leave it. There very well may be an easier way to hack this together without all of that footwork, but I wasn’t able to figure one out quite yet.
Conclusion
After all of this work I've accomplished the following:
- Learned a bit more about managing Proxmox via pct
- Scripted some basic lxc creation, backups, and deletion
- Built out shared scripting for common setup actions I was copy-pasting (auto updates, SSH, and email alerting)
- Put it all together and fully "defined" one of my containers
Hopefully some of this will be helpful, or at the very least interesting. I don’t endorse blindly copying my hackjob scripts but if you’re looking to do something similar and want some inspiration feel free.