Getting a clean boot
Apr 14, 2025
lenovo
homelab
kubernetes
nfsroot
read-only-root
tmpfs
/var
shared
Putting together a homelab Kubernetes cluster in my own stubborn way. I’m assuming a reader who’s basically me before I embarked on this little expedition, so I won’t go into minute detail about day-to-day Linux setup and administration - only the things that are new to me and have changed since I last encountered them.
- Part 0 - Best laid plans
- Part 1 - Installing the hardware
- Part 2 - Boot across the network
- Part 3 - PXE Booting Debian with an NFS Root Filesystem
- Part 4 - Filesystems for everybody!
- Part 5 - Getting a clean boot
- Part 6 - Kubernetes at last
Sections added as I actually proceed with this!
Cleaning up
One thing to address from part 4: it seems that ping is literally the only binary that was using capabilities. Is that weird? It seems weird to me…
$ sudo su -
$ cd /clients
$ find . -type f -executable -exec getcap {} \;
./usr/bin/ping cap_net_raw=ep
…but it does seem to be true. Good, I guess? If that was the only thing using them then giving it the setuid bit should be sufficient. Given that I’ve got no security on my NFS share within my local network it’s definitely not the biggest security issue at hand to grant it that way.
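If I do grant it that way, it’s only a couple of commands on the gateway (a sketch, run as root - setcap -r drops the file capability and the setuid bit takes over):
# Drop the capability from the master image's copy of ping and make it setuid root instead
setcap -r /clients/usr/bin/ping
chmod u+s /clients/usr/bin/ping
# Sanity check: getcap should now print nothing, and ls should show the 's' bit
getcap /clients/usr/bin/ping
ls -l /clients/usr/bin/ping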
I checked on my laptop running Ubuntu as well, and that does have a few extra items (some of them binaries within containers). Lots of copies of ping and then:
- arping - network ARP pinging tool, part of an Amazon Corretto image layer
- clockdiff - tool to measure clock differences between hosts, part of an Amazon Corretto image layer
- newgidmap - gid mapping for user namespaces, part of an Amazon Corretto image layer
- newuidmap - uid mapping for user namespaces, part of an Amazon Corretto image layer
- dumpcap - a network traffic capture tool
- mtr-packet - a network probing tool
- gst-ptp-helper - part of GStreamer, a media pipeline
Most of that doesn’t matter because it’s unlikely to ever run in this cluster, but I’ll remember¹ to look out for issues with newgidmap and newuidmap as the Kubernetes stuff is definitely going to involve some container namespaces!
I should probably keep an eye open for any capabilities settings with the Kubernetes binaries when I get to them as well.
So what else is sad?
Nothing concerning in the dmesg log output, but what about the systemd units?
dcminter@worker-node-448a5bddd8ba:~$ systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● apt-daily-upgrade.service loaded failed failed Daily apt upgrade and clean activities
● apt-daily.service loaded failed failed Daily apt download activities
● logrotate.service loaded failed failed Rotate log files
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
3 loaded units listed.
Well, that makes sense for apt-daily-upgrade and apt-daily. I should probably just turn off the apt updates. This would be a horrible thing in a production system as I wouldn’t get security updates, but we’ve established this cluster isn’t going to be secure anyway. Once I figure out how to get it running at all I’ll worry about how to make sure the master read-only worker node image gets updated.
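Turning them off should just be a matter of masking the timers in the master image; something like this from the gateway ought to do it (a sketch - I haven’t committed to it yet):
# Operate offline on the read-only master image from the gateway
systemctl --root=/clients mask apt-daily.timer apt-daily-upgrade.timer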
The logrotate not working is also not surprising; the logs are on a read-only filesystem so (a) I can’t write logs and (b) the log rotation can’t do anything to them. Time to make a decision there…
Sorting out /var
According to the Linux Foundation’s Filesystem Hierarchy Standard (FHS)…
Some portions of /var are not shareable between different systems. For instance, /var/log, /var/lock, and /var/run. Other portions may be shared, notably /var/mail, /var/cache/man, /var/cache/fonts, and /var/spool/news.
Uh… what’s actually under /var on these worker nodes?
dcminter@worker-node-448a5bddd8ba:~$ ls -al /var
total 44
drwxr-xr-x 11 root root 4096 Jun 29 2024 .
drwxrwxr-x 17 root root 4096 Oct 12 20:18 ..
drwxr-xr-x 2 root root 4096 Jun 30 2024 backups
drwxr-xr-x 7 root root 4096 Jun 29 2024 cache
drwxr-xr-x 14 root root 4096 Jun 29 2024 lib
drwxr-xr-x 2 root root 4096 Oct 12 21:12 local
lrwxrwxrwx 1 root root 9 Jun 29 2024 lock -> /run/lock
drwxr-xr-x 6 root root 4096 Oct 11 16:00 log
drwxrwsr-x 2 root mail 4096 Jun 29 2024 mail
drwxr-xr-x 2 root root 4096 Jun 29 2024 opt
lrwxrwxrwx 1 root root 4 Jun 29 2024 run -> /run
drwxr-xr-x 3 root root 4096 Jun 29 2024 spool
drwxrwxrwt 3 root root 4096 Oct 12 00:00 tmp
Right, and tmp seems to be a symlink to /tmp as well. That’s on the read-only filesystem and that’s surely going to cause trouble too.
Oh, and there are some tmpfs filesystems already…
dcminter@worker-node-448a5bddd8ba:~$ mount | grep tmpfs
udev on /dev type devtmpfs (rw,nosuid,relatime,size=8114380k,nr_inodes=2028595,mode=755,inode64)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=1628108k,mode=755,inode64)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=1628108k,nr_inodes=407027,mode=700,uid=1000,gid=1000,inode64)
So I can disregard lock and run because they’re already symlinks into tmpfs mounts.
I’m going to do this: /tmp and /var/log are going to get their own tmpfs mounts.
- Cons
  - This could cause issues if they fill up!
  - If a node is crashing then it’s going to be a nuisance that its logs are lost
- Pros
  - It’s easy to set up
  - The contents are completely isolated
  - They’ll be nice and fast
But by default the following mount-points will be mounted to directories within the worker-node’s /var/local hierarchy:
- /var/backups will become a symlink to /var/local/backups
- /var/cache will become a symlink to /var/local/cache
- /var/lib will become an NFS mount to /var/local/lib
- /var/opt will become an NFS mount to /var/local/opt
- /var/spool will become an NFS mount to /var/local/spool
For the remaining shareable directories I will create additional NFS mounts adjacent to /workers/home on the cluster gateway:
- /var/mail will become an NFS mount to /workers/var/mail
- /var/cache/man will become an NFS mount to /workers/var/cache/man
- /var/cache/fonts will become an NFS mount to /workers/var/cache/fonts
- /var/spool/news will become an NFS mount to /workers/var/spool/news
(though I don’t foresee running NNTP here any time soon…)
Note that some of those overlap, but with this ordering it ought to be ok… I think?
Creating the tmpfs mounts
Adding the tmpfs mounts works fine - after a long digression because I’d forgotten to prefix the mount points with /root in the initrd mount script! Logs and tmp files are getting captured AOK (and this actually made it clearer that systemd is managing logs, not the old syslogd). Here are the lines added to my mount_node_nfs script:
# Mount the extra tmpfs filesystems
kprint "Mounting /tmp as a tmpfs with TMPSIZE=$TMPSIZE"
mount -t tmpfs -o "nodev,noexec,nosuid,size=${TMPSIZE:-5%},mode=0777" tmpfs /root/tmp
kprint "Mounting /var/log as a tmpfs with LOGSIZE=$LOGSIZE"
mount -t tmpfs -o "nodev,noexec,nosuid,size=${LOGSIZE:-5%},mode=0775" tmpfs /root/var/log
I’m allowing each one an extra 5% of memory by default - at the moment TMPSIZE and LOGSIZE aren’t set, but I’m allowing for them to be overridden in some other config or boot command line parameters later.
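If they do start to fill up it should at least be easy to spot from a worker node:
# Both tmpfs mounts show up with their size limits and current usage
df -h /tmp /var/log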
Incidentally that problem where I’d forgotten to prefix the mountpoints with /root was really annoying to debug, because there weren’t any errors or anything - it’s just that there weren’t any tmpfs mounts in the booted system after the pivot. Hopefully I won’t forget about that again!
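For next time: the quickest check is just to look for the expected entries in the mount table straight after boot, e.g.:
mount | grep -E ' /tmp | /var/log '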
Creating the non-shared var symlinks
Per the plan, I basically zapped all the original directories and re-symlinked them under the /clients directory on the gateway machine so that they’d appear as symlinks in the appropriate places once NFS mounted on the worker nodes (there’s a sketch of the commands after this list):
- /clients/var/backups gets symlinked to /var/local/backups (likely not all that important for my purposes…) (deleted old contents in the original share)
- /clients/var/cache gets symlinked to /var/local/cache (deleted old contents in the original share)
- /clients/var/lib gets symlinked to /var/local/lib (deleted old contents in the original share)
- /clients/var/opt gets symlinked to /var/local/opt (deleted old directory, no contents)
- /clients/var/spool gets symlinked to /var/local/spool (deleted old contents in the original share)
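On the gateway that amounted to roughly the following (a sketch from memory - run as root, and only after deciding the old contents really can go):
# Replace the master image's non-shareable /var directories with symlinks into /var/local
cd /clients/var
for d in backups cache lib opt spool; do
    rm -rf "$d"
    ln -s "/var/local/$d" "$d"
done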
If I do this but nothing else then systemd seems to get a bit upset … the worker nodes still boot but terribly slowly and there are some complaints in the dmesg log output:
...
[ 9.611374] systemd[1]: systemd-random-seed.service: Main process exited, code=exited, status=1/FAILURE
[ 9.611651] systemd[1]: systemd-random-seed.service: Failed with result 'exit-code'.
[ 9.611953] systemd[1]: Failed to start systemd-random-seed.service - Load/Save Random Seed.
[ 9.612479] systemd[1]: first-boot-complete.target - First Boot Complete was skipped because of an unmet condition check (ConditionFirstBoot=yes).
[ 9.621888] systemd[1]: Finished systemd-sysctl.service - Apply Kernel Variables.
[ 9.642473] systemd[1]: Finished systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev.
[ 9.642804] systemd[1]: Reached target local-fs-pre.target - Preparation for Local File Systems.
[ 9.643025] systemd[1]: Reached target local-fs.target - Local File Systems.
[ 9.645500] systemd[1]: systemd-binfmt.service - Set Up Additional Binary Formats was skipped because of an unmet condition check (ConditionPathIsMountPoint=/proc/sys/fs/binfmt_misc).
[ 9.645643] systemd[1]: systemd-machine-id-commit.service - Commit a transient machine-id on disk was skipped because of an unmet condition check (ConditionPathIsMountPoint=/etc/machine-id).
...
However, I then also added the following lines into the initrd image’s mount_node_nfs script to create these target directories under the share:
# Make sure the various /var/ mount points are OK
mkdir -p /root/var/local/backups
mkdir -p /root/var/local/cache
mkdir -p /root/var/local/lib
mkdir -p /root/var/local/opt
mkdir -p /root/var/local/spool
With this change the reboot runs swiftly, and after the reboot I can see that various additional subdirectories have been populated in the worker-specific directories; before the changes the /workers NFS share content was:
dcminter@cluster-gateway:/workers$ tree
.
├── 0023249434ae
│ ├── cmdline
│ ├── hostname
│ └── hosts
├── 448a5bddd8ba
│ ├── cmdline
│ ├── hostname
│ └── hosts
After modifying the init script - and recalling that the init script only directly creates 10 (2 nodes x 5 additions) of these directories - the contents of the /workers share has become:
dcminter@cluster-gateway:/workers$ tree
.
├── 0023249434ae
│ ├── backups
│ ├── cache
│ │ └── private [error opening dir]
│ ├── cmdline
│ ├── hostname
│ ├── hosts
│ ├── lib
│ │ ├── dbus
│ │ │ └── machine-id -> /etc/machine-id
│ │ ├── private [error opening dir]
│ │ └── systemd
│ │ ├── coredump
│ │ ├── linger
│ │ ├── pstore
│ │ ├── random-seed
│ │ └── timers
│ │ ├── stamp-apt-daily.timer
│ │ ├── stamp-apt-daily-upgrade.timer
│ │ ├── stamp-e2scrub_all.timer
│ │ ├── stamp-fstrim.timer
│ │ └── stamp-logrotate.timer
│ ├── opt
│ └── spool
│ └── cron
│ └── crontabs [error opening dir]
├── 448a5bddd8ba
│ ├── backups
│ ├── cache
│ │ └── private [error opening dir]
│ ├── cmdline
│ ├── hostname
│ ├── hosts
│ ├── lib
│ │ ├── dbus
│ │ │ └── machine-id -> /etc/machine-id
│ │ ├── private [error opening dir]
│ │ └── systemd
│ │ ├── coredump
│ │ ├── linger
│ │ ├── pstore
│ │ ├── random-seed
│ │ └── timers
│ │ ├── stamp-apt-daily.timer
│ │ ├── stamp-apt-daily-upgrade.timer
│ │ ├── stamp-e2scrub_all.timer
│ │ ├── stamp-fstrim.timer
│ │ └── stamp-logrotate.timer
│ ├── opt
│ └── spool
│ └── cron
│ └── crontabs [error opening dir]
├── home
│ ├── dcminter
│ └── root [error opening dir]
└── var
Ignore the “error opening dir” messages; that’s just because I didn’t run it as root, whereas on the workers some of those directories are being created for root’s eyes only.
Anyway, now the dmesg output has calmed down:
[ 9.648331] systemd[1]: Starting systemd-random-seed.service - Load/Save Random Seed...
[ 9.650213] systemd[1]: systemd-sysusers.service - Create System Users was skipped because no trigger condition checks were met.
[ 9.651264] systemd[1]: Starting systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev...
[ 9.656319] systemd[1]: Finished systemd-modules-load.service - Load Kernel Modules.
[ 9.657561] systemd[1]: Starting systemd-sysctl.service - Apply Kernel Variables...
[ 9.684091] systemd[1]: Finished systemd-random-seed.service - Load/Save Random Seed.
[ 9.684518] systemd[1]: first-boot-complete.target - First Boot Complete was skipped because of an unmet condition check (ConditionFirstBoot=yes).
[ 9.687169] systemd[1]: Finished systemd-sysctl.service - Apply Kernel Variables.
...
So that’s looking pretty solid.
Creating the shared var mountpoints
Next up, the remaining NFS mountpoints that don’t need to be exclusive to the individual workers. Here I planned to copy across all of the original files rather than starting with empty directories.
I screwed up slightly though… I deleted the contents of /clients/var/cache and /clients/var/spool in the preceding step, so I don’t have any contents to copy over for those two. However, checking the contents of the gateway server’s own OS install (which, you may recall, is also a Debian 12 system) it seems like they’re not populated with any actual files beyond auto-created ones, so I just create my new targets as empty directories, i.e. I create:
- /workers/var/mail
- /workers/var/cache/man
- /workers/var/cache/fonts
- /workers/var/spool/news
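In other words, on the gateway, something along these lines:
mkdir -p /workers/var/mail /workers/var/cache/man /workers/var/cache/fonts /workers/var/spool/news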
Then I add the corresponding NFS mount points to the initrd image’s script:
# Mount the shared /var nfs paths
kprint "Mounting the shared /var nfs paths"
# /var/mail will already exist as an empty directory from the root nfs mount
nfsmount -o rw 192.168.0.254:/workers/var/mail /root/var/mail
mkdir -p /root/var/local/cache/man
nfsmount -o rw 192.168.0.254:/workers/var/cache/man /root/var/local/cache/man
mkdir -p /root/var/local/cache/fonts
nfsmount -o rw 192.168.0.254:/workers/var/cache/fonts /root/var/local/cache/fonts
mkdir -p /root/var/local/spool/news
nfsmount -o rw 192.168.0.254:/workers/var/spool/news /root/var/local/spool/news
After the pivot the /root/... directories should correspond to those empty directories created under /workers/... on the gateway (NFS server). I gave this a quick test by creating example files on a worker in e.g. /var/spool/news and verifying that on the gateway they materialise in /workers/var/spool/news, so that looks solid. None of these are directories I really anticipate using in the Kubernetes part of this project, but it’s good to get everything looking at least roughly correct.
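The test itself was nothing fancier than roughly this:
# On a worker node, as root
echo 'hello from the worker' > /var/spool/news/test-file
# Then back on the gateway
cat /workers/var/spool/news/test-file
hello from the worker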
What’s left?
I do see the following lines from the logs emitted by journalctl at this point…
Apr 13 17:03:42 worker-node-448a5bddd8ba systemd-tmpfiles[246]: "/var/lib" already exists and is not a directory.
Apr 13 17:03:42 worker-node-448a5bddd8ba systemd[1]: Started systemd-udevd.service - Rule-based Manager for Device Events and Files.
Apr 13 17:03:42 worker-node-448a5bddd8ba systemd-tmpfiles[246]: Failed to create directory or subvolume "/root", ignoring: Read-only file system
Apr 13 17:03:42 worker-node-448a5bddd8ba systemd-tmpfiles[246]: Failed to open path '/root', ignoring: No such file or directory
Apr 13 17:03:42 worker-node-448a5bddd8ba systemd-tmpfiles[246]: "/var/cache" already exists and is not a directory.
Apr 13 17:03:42 worker-node-448a5bddd8ba systemd-tmpfiles[246]: "/var/spool" already exists and is not a directory.
It’s quite true that these paths are no longer directories; they’re symlinks now. Consulting the manpage for tmpfiles.d it looks like I can edit the files under /clients/usr/lib/tmpfiles.d and adjust the configurations so that instead of managing directories they manage symlinks. Better yet, the exact matches with those paths (rather than subdirectories under them) are all managed from the var.conf file there.
Editing that file, I make the following changes (these lines are all at the end of the file). Before my changes, the d character indicates a “directory to create and clean up”:
d /var/cache 0755 - - -
d /var/lib 0755 - - -
d /var/spool 0755 - - -
After my changes, the L character indicates a “symlink to create”:
L /var/cache 0755 - - -
L /var/lib 0755 - - -
L /var/spool 0755 - - -
This knocks those errors from journalctl on the head successfully.
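For a quicker check than a full reboot, the edited config can also be run through systemd-tmpfiles directly on a booted worker - no complaints means it parses and applies cleanly:
# As root on a worker; the basename is looked up in the standard tmpfiles.d directories
systemd-tmpfiles --create var.conf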
That server IP address
Just one more thing I want to clean up now - the big mount_node_nfs script has the IP address of the NFS server hard-coded into it. I’d prefer to take that directly from the incoming command line. On one of the worker nodes that command line (readable from /proc/cmdline, you remember) ends up looking like this:
BOOT_IMAGE=vmlinuz-6.1.0-21-amd64 root=/dev/nfs nfsroot=192.168.0.254:/clients,ro ip=dhcp nfsrootdebug initrd=initrd-cluster-25_04_13_22_51_44_CEST ip=192.168.0.3:192.168.0.254:192.168.0.254:255.255.255.0 BOOTIF=01-44-8a-5b-dd-d8-ba CPU=6PVXL
The nfsroot parameter has to contain the address of the NFS server. At some point I will want to separate the NFS server from the Gateway server, so I’d rather take it from there than assume that the gateway part of the ip parameter (that has the node’s IP, the gateway’s IP, and the netmask in it) is the right value. A bit of sed magic should do the trick there…
sed -n 's/.* nfsroot=\([0-9.]*\).*/\1/p'
That assumes an IPv4 address and not a host name or an IPv6 address, but those do seem like safe assumptions here; I’m not going to add IPv6 to the already long-winded project.
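A quick sanity check of that expression against a cut-down copy of the command line gives the expected result:
$ echo 'root=/dev/nfs nfsroot=192.168.0.254:/clients,ro ip=dhcp' | sed -n 's/.* nfsroot=\([0-9.]*\).*/\1/p'
192.168.0.254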
First I pop in the sed script and a diagnostic output…
NFS_SERVER=$(/bin/sed -n 's/.* nfsroot=\([0-9.]*\).*/\1/p' /proc/cmdline)
kprint "NFS server IP address is $NFS_SERVER"
Checking the dmesg log after a reboot…
...
[ 8.892871] dcminter: NFS server IP address is 192.168.0.254
...
That looks right. Last thing to do is rewrite the explicit uses of 192.168.0.254 in the script to use the environment variable instead (just a search and replace in the script).
This bit worked first time! Here’s the full init script that I’ve ended up with so far:
#!/bin/sh -e
function kprint() {
echo "dcminter: $1" > /dev/kmsg
}
function nfs_mount_node() {
kprint 'About to attempt to mount the node share on nfs'
kprint "Boot variable is $BOOTIF"
kprint "rootmnt is $rootmnt"
SUFFIX=$(/bin/sed 's/.*BOOTIF=\(..\-..\-..\-..\-..\-..\-..\).*/\1/' /proc/cmdline | /bin/sed 's/..\-\(..\)\-\(..\)\-\(..\)\-\(..\)\-\(..\)\-\(..\)/\1\2\3\4\5\6/')
kprint "Target hostname directory has suffix $SUFFIX"
NFS_SERVER=$(/bin/sed -n 's/.* nfsroot=\([0-9.]*\).*/\1/p' /proc/cmdline)
kprint "NFS server IP address is $NFS_SERVER"
kprint "Dumping boot commandline"
cat /proc/cmdline > /dev/kmsg
# Note - Anything mounted under /root will be under / after the pivot!
kprint "Create the node directory under the /workers share"
mkdir -p /workers # Mount ephemerally - this mountpoint intentionally won't be around after the pivot!
nfsmount -o rw $NFS_SERVER:/workers /workers
kprint "Creating the node's directory if it does not already exist"
mkdir -p /workers/$SUFFIX
umount /workers
kprint "Unmounted /workers"
kprint "Mounting /workers/home to /root/home in preparation for pivot"
nfsmount -o rw $NFS_SERVER:/workers/home /root/home
kprint "Mounted /root/home"
kprint "Mount the node's directory as the writeable /root/var/local"
nfsmount -o rw $NFS_SERVER:/workers/$SUFFIX /root/var/local
kprint "Mounted /root/workers"
# Make sure the various /var/ mount points are OK
mkdir -p /root/var/local/backups
mkdir -p /root/var/local/cache
mkdir -p /root/var/local/lib
mkdir -p /root/var/local/opt
mkdir -p /root/var/local/spool
# Mount the shared /var nfs paths
kprint "Mounting the shared /var nfs paths"
# /var/mail will already exist as an empty directory from the root nfs mount
nfsmount -o rw $NFS_SERVER:/workers/var/mail /root/var/mail
mkdir -p /root/var/local/cache/man
nfsmount -o rw $NFS_SERVER:/workers/var/cache/man /root/var/local/cache/man
mkdir -p /root/var/local/cache/fonts
nfsmount -o rw $NFS_SERVER:/workers/var/cache/fonts /root/var/local/cache/fonts
mkdir -p /root/var/local/spool/news
nfsmount -o rw $NFS_SERVER:/workers/var/spool/news /root/var/local/spool/news
# Mount the extra tmpfs filesystems
kprint "Mounting /tmp as a tmpfs with TMPSIZE=$TMPSIZE"
mount -t tmpfs -o "nodev,noexec,nosuid,size=${TMPSIZE:-5%},mode=0777" tmpfs /root/tmp
kprint "Mounting /var/log as a tmpfs with LOGSIZE=$LOGSIZE"
mount -t tmpfs -o "nodev,noexec,nosuid,size=${LOGSIZE:-5%},mode=0775" tmpfs /root/var/log
# Set the hostname (and make it sticky)
kprint "Set the hostname"
hostname "worker-node-$SUFFIX"
echo "worker-node-$SUFFIX" > /root/var/local/hostname
kprint "Hostname should be worker-node-$SUFFIX now"
kprint "Adding loopback resolution for hostname"
cp /etc/hosts /root/var/local/hosts
echo "127.0.0.1 worker-node-$SUFFIX" >> /root/var/local/hosts
echo "::1 worker-node-$SUFFIX" >> /root/var/local/hosts
kprint "Appropriate hosts file created."
kprint "Write cmdline to var mount to make ip address identification easier from outside the worker node"
cat /proc/cmdline > /root/var/local/cmdline
kprint "Written cmdline"
}
kprint 'We ran a boot script after mounting root'
nfs_mount_node
Here are all of the diagnostic outputs from that script that end up in the dmesg log after booting one of the worker nodes:
[ 8.882392] dcminter: We ran a boot script after mounting root
[ 8.882490] dcminter: About to attempt to mount the node share on nfs
[ 8.882562] dcminter: Boot variable is 01-00-23-24-94-34-ae
[ 8.882631] dcminter: rootmnt is /root
[ 8.883791] dcminter: Target hostname directory has suffix 0023249434ae
[ 8.884616] dcminter: NFS server IP address is 192.168.0.254
[ 8.884689] dcminter: Dumping boot commandline
[ 8.885342] BOOT_IMAGE=vmlinuz-6.1.0-21-amd64 root=/dev/nfs nfsroot=192.168.0.254:/clients,ro ip=dhcp nfsrootdebug initrd=initrd-cluster-25_04_13_23_35_12_CEST ip=192.168.0.4:192.168.0.254:192.168.0.254:255.255.255.0 BOOTIF=01-00-23-24-94-34-ae CPU=6PVXL
[ 8.885596] dcminter: Create the node directory under the /workers share
[ 8.891270] dcminter: Creating the node's directory if it does not already exist
[ 8.925022] dcminter: Unmounted /workers
[ 8.925099] dcminter: Mounting /workers/home to /root/home in preparation for pivot
[ 8.930581] dcminter: Mounted /root/home
[ 8.930656] dcminter: Mount the node's directory as the writeable /root/var/local
[ 8.957847] dcminter: Mounted /root/workers
[ 8.968580] dcminter: Mounting the shared /var nfs paths
[ 9.065769] dcminter: Mounting /tmp as a tmpfs with TMPSIZE=
[ 9.066910] dcminter: Mounting /var/log as a tmpfs with LOGSIZE=
[ 9.068002] dcminter: Set the hostname
[ 9.070544] dcminter: Hostname should be worker-node-0023249434ae now
[ 9.070619] dcminter: Adding loopback resolution for hostname
[ 9.074700] dcminter: Appropriate hosts file created.
[ 9.074777] dcminter: Write cmdline to var mount to make ip address identification easier from outside the worker node
[ 9.076708] dcminter: Written cmdline
The script could be a bit more elegant - it’s a bit repetitive and parts of it likely ought to be split out into their own dedicated init scripts, but for now this will do.
I don’t suppose it’s the last time I’ll rebuild the initrd image, but it’s a good point to move on from.
How sad are we now?
But wait, are those systemd units all working ok now?
dcminter@worker-node-0023249434ae:~$ systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
0 loaded units listed.
Yes. Much less sad!
Moar Contacts!
Since I posted the last entry in this series on setting up the cluster I got another note, this time from Alexander who’s setting up something similar in overall effect, but likely a lot more robust in actual implementation. His approach is based more around overlay filesystems.
Very nice to get these alternative perspectives on a similar project - and interesting that there seem to be quite a few lone enthusiasts setting up remote booted clusters like mine.
Next
This part felt more like tidying up than anything really tricky. Between other distractions it took me a while to get back and finish it, but it was pretty plain sailing once I did.
I’m pretty sure it’s going to take more than one part for the rest of getting Kubernetes up and running on this little cluster of machines. One thing I can see might be a problem is that I’ve not left myself an easy way to install new software at the OS level on these systems - but it wouldn’t be fun without a few bumps in the road. That said, it would not surprise me at all if I have to backtrack a bit and try something like the approach Alexander adopted with overlay filesystems. We’ll see.
So. Coming soon for extremely large values of elapsed time for soon: Part 6 - Kubernetes at last
Footnotes
¹ I have a nasty feeling that this is foreshadowing and that actually I will totally forget that and have a frickin' nightmare figuring out that this is at the root of some config black hole. You can laugh at me when I get there if so.