DougReeder

What?

The backup and restore scripts will, when the pgsql pod does not exist, log a message to stderr and continue backing up or restoring the reticulum files.
Also handles empty resource blocks in hcce.yaml.
Also extracts the IP addresses of all load balancers.
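
As a rough illustration of the "log to stderr and continue" behavior described above, here is a minimal sketch. The function names (`findPodName`, `getPgsqlPodName`, `backupPlan`) are hypothetical and are not taken from the scripts in this PR, and it assumes the pod name starts with `pgsql`; the real implementation may detect the pod differently.

```javascript
const { execFileSync } = require("node:child_process");

// Pure helper (hypothetical): given the parsed JSON from
// `kubectl get pods -o json`, return the name of the first pod whose
// name starts with `prefix`, or null when there is none.
function findPodName(podList, prefix) {
  const items = (podList && podList.items) || [];
  const pod = items.find(
    (p) => typeof p?.metadata?.name === "string" && p.metadata.name.startsWith(prefix)
  );
  return pod ? pod.metadata.name : null;
}

// Hypothetical wrapper: ask the cluster for the pods in the namespace.
// Assumes kubectl is on the PATH and configured for the cluster.
function getPgsqlPodName(namespace = "hcce") {
  const out = execFileSync("kubectl", ["get", "pods", "-n", namespace, "-o", "json"], {
    encoding: "utf8",
  });
  return findPodName(JSON.parse(out), "pgsql");
}

// Decide what to back up. A missing pgsql pod is reported on stderr
// and the reticulum files are still processed.
function backupPlan(pgsqlPodName) {
  if (pgsqlPodName === null) {
    console.error("pgsql pod not found");
    return ["reticulum"];
  }
  return ["reticulum", "pgsql"];
}
```

A caller would use it as `backupPlan(getPgsqlPodName())` and iterate over the returned list, so the reticulum step never depends on the database step.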

Why?

An instance using an external database (https://hominidsoftware.com/tech-personal-growth/Hubs-Managed-Databse/Hubs-Managed-Database/) will not have a pgsql pod.
Also, a damaged instance might not be running the pgsql pod.
There is still value in backing up and/or restoring just the reticulum files.

A modern ingress controller might not be in the hcce namespace.
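
One way to find load balancers outside the hcce namespace is to list services across all namespaces and keep those of type LoadBalancer. The sketch below assumes that approach; `listLoadBalancerAddresses` and `getAllLoadBalancers` are hypothetical names, not the functions in `get_ip/index.js`.

```javascript
const { execFileSync } = require("node:child_process");

// Pure helper (hypothetical): given the parsed JSON from
// `kubectl get services --all-namespaces -o json`, return one entry per
// external address of every LoadBalancer service.
function listLoadBalancerAddresses(serviceList) {
  const results = [];
  for (const svc of (serviceList && serviceList.items) || []) {
    if (svc?.spec?.type !== "LoadBalancer") continue;
    // The address is absent until the cloud provider has assigned one.
    const ingress = svc?.status?.loadBalancer?.ingress || [];
    for (const entry of ingress) {
      const address = entry.ip || entry.hostname;
      if (address) {
        results.push({
          namespace: svc.metadata.namespace,
          name: svc.metadata.name,
          address,
        });
      }
    }
  }
  return results;
}

// Hypothetical wrapper: assumes kubectl is on the PATH.
function getAllLoadBalancers() {
  const out = execFileSync("kubectl", ["get", "services", "--all-namespaces", "-o", "json"], {
    encoding: "utf8",
  });
  return listLoadBalancerAddresses(JSON.parse(out));
}
```

Returning the namespace and service name alongside each address lets the user pick the right one when a cluster has more than one load balancer.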

Examples

The instance at hubs.hominidsoftware.com has an external database (and so lacks the pgsql pod), uses the current version of the HAProxy ingress controller in the namespace haproxy-controller (so the LoadBalancer service is also in that namespace), and has a section of hcce.yaml in which everything between two --- separators is commented out:

node restore_backup_script/index.js

maintenance-mode-hcce.yaml file generated successfully.
applying maintenance mode

deployment.apps "coturn" deleted
deployment.apps "dialog" deleted
deployment.apps "hubs" deleted
deployment.apps "nearspark" deleted
deployment.apps "photomnemonic" deleted
deployment.apps "reticulum" deleted
deployment.apps "spoke" deleted
pod "coturn-74d6cdb5b4-c9x9z" deleted
pod "dialog-6f56f69c55-ztzv2" deleted
pod "hubs-85487ddc9c-ln8m4" deleted
pod "nearspark-795986bd6b-grc76" deleted
pod "photomnemonic-85ff5bf8d5-js8fv" deleted
pod "reticulum-c7c6f67bd-6jhs8" deleted
pod "spoke-7659644cc5-w97dq" deleted
namespace/hcce unchanged
secret/configs configured
persistentvolumeclaim/ret-pvc unchanged
ingress.networking.k8s.io/ret-modern configured
ingress.networking.k8s.io/dialog-modern configured
ingress.networking.k8s.io/nearspark-modern configured
configmap/ret-config unchanged
deployment.apps/reticulum created
service/ret unchanged
deployment.apps/hubs created
service/hubs unchanged
deployment.apps/spoke created
service/spoke unchanged
deployment.apps/nearspark created
service/nearspark unchanged
deployment.apps/photomnemonic created
service/photomnemonic unchanged
deployment.apps/dialog created
service/dialog unchanged
deployment.apps/coturn created
service/coturn unchanged
waiting on coturn, dialog, hubs, nearspark, photomnemonic, reticulum, spoke
waiting on coturn, dialog, photomnemonic, reticulum
waiting on reticulum
maintenance mode applied
pgsql pod not found

restoring backup

restoring Reticulum '._cached' folder
restoring Reticulum '._expiring' folder
restoring Reticulum '._owned' folder
restoring Reticulum '._storage' folder
restoring Reticulum 'cached' folder
restoring Reticulum 'expiring' folder
restoring Reticulum 'lost+found' folder
restoring Reticulum 'owned' folder
not restoring pgsql

restarting instance
deployment.apps "coturn" deleted
deployment.apps "dialog" deleted
deployment.apps "hubs" deleted
deployment.apps "nearspark" deleted
deployment.apps "photomnemonic" deleted
deployment.apps "reticulum" deleted
deployment.apps "spoke" deleted
pod "coturn-74d6cdb5b4-n4cqx" deleted
pod "dialog-6f56f69c55-xclvq" deleted
pod "hubs-85487ddc9c-6ldlw" deleted
pod "nearspark-795986bd6b-t49q7" deleted
pod "photomnemonic-85ff5bf8d5-56hsq" deleted
pod "reticulum-c7c6f67bd-flbq2" deleted
pod "spoke-7659644cc5-27z9m" deleted

script@1.0.0 apply
node apply/index.js && node get_ip/index.js

namespace/hcce unchanged
secret/configs configured
persistentvolumeclaim/ret-pvc unchanged
ingress.networking.k8s.io/ret-modern configured
ingress.networking.k8s.io/dialog-modern configured
ingress.networking.k8s.io/nearspark-modern configured
configmap/ret-config unchanged
deployment.apps/reticulum created
service/ret unchanged
deployment.apps/hubs created
service/hubs unchanged
deployment.apps/spoke created
service/spoke unchanged
deployment.apps/nearspark created
service/nearspark unchanged
deployment.apps/photomnemonic created
service/photomnemonic unchanged
deployment.apps/dialog created
service/dialog unchanged
deployment.apps/coturn created
service/coturn unchanged
waiting on coturn, dialog, hubs, nearspark, photomnemonic, reticulum, spoke
waiting on coturn, reticulum
waiting on reticulum
all deployments ready
load balancer external IP address: 146.190.190.57
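
The example instance above has a block of hcce.yaml in which everything between two --- separators is commented out. A minimal sketch of skipping such empty blocks follows; it splits on separator lines by hand and uses the hypothetical name `splitYamlDocuments`. The actual scripts may instead rely on a YAML parser.

```javascript
// Split a multi-document YAML string on `---` separator lines and drop
// documents that contain only blank lines or comments.
// Simplification: a `---` line inside a block scalar would also be
// treated as a separator.
function splitYamlDocuments(text) {
  return text
    .split(/^---[ \t\r]*$/m)
    .filter((doc) =>
      doc.split("\n").some((line) => {
        const trimmed = line.trim();
        return trimmed !== "" && !trimmed.startsWith("#");
      })
    );
}
```

Each surviving document can then be processed on its own without special-casing empty input.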

How to test

  1. run kubectl scale deployment pgsql -n hcce --replicas=0
  2. run npm run backup and observe that it creates files in community-edition/data_backups/data_backup_999999/reticulum_storage_data, but no pgsql files
  3. run npm run restore-backup and observe that it runs to completion
  4. run kubectl scale deployment pgsql -n hcce --replicas=1
  5. run backup and restore scripts, and observe they back up and restore both reticulum and pgsql files

Documentation of functionality

A paragraph has been added to the section of the readme on backup and restore. People running an external database presumably know that, and the script output should be clear.

Limitations

Backing up an external database must be done separately.

Open questions

Backing up and restoring an external PostgreSQL database might or might not fit in these scripts.

Why: An instance using an external database (https://hominidsoftware.com/tech-personal-growth/Hubs-Managed-Databse/Hubs-Managed-Database/) will not have a pgsql pod.
Also, a damaged instance might not be running the pgsql pod.
There is still value in backing up and/or restoring just the reticulum files.

Also handles empty blocks in `hcce.yaml`.
Also extracts the IP addresses of all load balancers, as a modern ingress controller might not be in the `hcce` namespace.

Open Question: backing up and restoring an external PostgreSQL database might or might not fit in these scripts.
Owner

@Exairnous Exairnous left a comment

With the fixes we talked about at the dev meeting, this is looking pretty good. Thanks.

There's one thing I left a comment on, and I still need to run some QA tests.

I'll comment again once I've done the QA tests.

Owner

@Exairnous Exairnous left a comment

I've done my QA tests. I found one bug, which is noted in the inline comment, but aside from that everything looks good. Thanks.

Note for anyone looking back at this PR (this is just information and isn't part of the review):
To test the restore script without the pgsql pod, I had to modify the script: after maintenance mode is applied, I added a couple of lines that scale the pgsql pod down again and wait 10 seconds so the pod has time to finish scaling down. (Applying maintenance mode automatically brings the pgsql pod back up, so the script then looks for the database dump and fails when it can't find it.)
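
For anyone repeating that QA step, the temporary modification might look roughly like the sketch below. It is not the code that was used; `scaleArgs` and `scaleDownPgsqlAndWait` are hypothetical names, and the kubectl arguments mirror the scale command from the "How to test" steps.

```javascript
const { execFileSync } = require("node:child_process");

// Build the kubectl arguments for scaling a deployment.
function scaleArgs(deployment, namespace, replicas) {
  return ["scale", "deployment", deployment, "-n", namespace, `--replicas=${replicas}`];
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Scale pgsql to zero, then pause so the pod has time to terminate.
// `run` is injectable so the function can be exercised without a cluster;
// by default it shells out to kubectl.
async function scaleDownPgsqlAndWait(
  run = (args) => execFileSync("kubectl", args, { stdio: "inherit" }),
  waitMs = 10000
) {
  run(scaleArgs("pgsql", "hcce", 0));
  await sleep(waitMs);
}
```

A fixed 10-second pause is only a testing shortcut; polling until the pod is gone would be more robust.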

Why: If there is more than one load balancer in the cluster, the user needs to select the appropriate one.
Owner

@Exairnous Exairnous left a comment

LGTM. Thanks. Merging.

@Exairnous Exairnous merged commit aff6c9c into Exairnous:backup-restore-scripts Aug 15, 2025
1 check passed