This repository has been archived by the owner on Jan 18, 2024. It is now read-only.

Use a custom liveness probe to handle custom cleanup on exit
I believe kudos goes to Danil for this idea originally.
thedodd committed Dec 8, 2023
1 parent ed251df commit de4476d
Showing 3 changed files with 70 additions and 19 deletions.
50 changes: 50 additions & 0 deletions charts/timescaledb-single/scripts/liveness_probe.sh
@@ -0,0 +1,50 @@
#!/bin/bash
#
# This liveness probe, as used in `statefulset-timescaledb.yaml` for the `timescaledb` container,
# is only executed after the readiness probe has reported success.
# As such, the first check is pg_isready; if PostgreSQL is ready, exit immediately.
# If PostgreSQL is not ready, a few other conditions are checked to determine
# whether it is safe to report the pod as not live.
#
# This script can also be extended with custom logic. The first use case for custom logic is
# optionally shutting down this pod's linkerd-proxy sidecar when running with linkerd.

# First, check pg_isready, just as the readiness probe does.
if pg_isready -h /var/run/postgresql; then
echo "pg is ready, exit 0"
exit 0
fi

# PG is not ready, so next check Patroni's liveness endpoint.
if curl -s -f -XGET http://localhost:8008/liveness; then
echo "patroni is live, exit 0"
exit 0
fi

# So far, PG is not ready, and patroni is either gone or reporting a non-2xx status.
# Check to see if the patroni process is still around.
if pgrep -f "patroni"; then
echo "patroni is still kicking, exit 0"
exit 0
fi

# NOTE that pgbackrest archival of WAL is spawned by the postgres process which is managed
# by patroni. If there is no patroni process, then we shouldn't have any child processes
# still around. As such, the last pgrep for patroni should obviate the need for directly
# checking for archival processes still lingering about.
#
# However, there is such a thing as zombie processes ... so let's just check anyway.
if pgrep "pgbackrest"; then
echo "pgbackrest is still kicking, exit 0"
exit 0
fi

echo "PG is not ready, patroni process is gone, no pgbackrest operations detected, this thing is dead"

# First param indicates that we should attempt to shut down the local linkerd-proxy on exit.
if [[ $1 == "1" ]]; then
echo "executing custom cleanup routine, shutting down local linkerd-proxy"
curl -s -m 5 -X POST http://localhost:4191/shutdown
fi

exit 1
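The script's overall shape is a short-circuiting cascade: any one positive signal (PostgreSQL ready, Patroni live, Patroni process present, pgbackrest still busy) keeps the pod alive, and only when every check fails does the probe report failure. A minimal sketch of that cascade, where the `check_*` functions are hypothetical stand-ins for the real `pg_isready`, `curl`, and `pgrep` calls:

```shell
#!/bin/bash
# Hypothetical stand-ins for the real checks; in the chart these are
# pg_isready, the Patroni /liveness endpoint, and pgrep.
check_pg()           { false; }
check_patroni_http() { false; }
check_patroni_proc() { false; }
check_pgbackrest()   { false; }

probe() {
    # Any single positive signal short-circuits to "live" (return 0).
    check_pg           && { echo "pg is ready";           return 0; }
    check_patroni_http && { echo "patroni is live";       return 0; }
    check_patroni_proc && { echo "patroni still running"; return 0; }
    check_pgbackrest   && { echo "pgbackrest still busy"; return 0; }
    # Only when every check has failed do we report a dead pod.
    return 1
}

if probe; then
    echo "probe: live"
else
    echo "probe: dead"   # with all checks failing, this branch runs
fi
```

With all four stand-ins failing, this prints `probe: dead`; flipping any one of them to `true` flips the result, mirroring the "any sign of life wins" design of the real script.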
31 changes: 12 additions & 19 deletions charts/timescaledb-single/templates/statefulset-timescaledb.yaml
@@ -102,25 +102,6 @@ spec:
# we can still serve clients.
terminationGracePeriodSeconds: 600
containers:
{{- if .Values.linkerd.adminShutdownOnExit }}
- name: linkerd-shutdown
securityContext:
allowPrivilegeEscalation: false
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
command:
- /bin/bash
- "-c"
- |
while true; do
ret=`curl -s -m -X GET http://localhost:8008/liveness`
if [ $ret != 0 ]; break; fi
sleep 10
done
# Once Patroni exits, when running with a linkerd-proxy sidecar, we call linkerd-proxy shutdown.
curl -s -m 5 -X POST http://localhost:4191/shutdown
{{- end }}
- name: timescaledb
securityContext:
allowPrivilegeEscalation: false
@@ -290,6 +271,18 @@ spec:
successThreshold: {{ .Values.readinessProbe.successThreshold }}
failureThreshold: {{ .Values.readinessProbe.failureThreshold }}
{{- end }}
{{- if .Values.livenessProbe.enabled }}
livenessProbe:
exec:
command:
- "{{ template "scripts_dir" . }}/liveness_probe.sh"
- "{{ if .Values.linkerd.adminShutdownOnExit }}1{{else}}0{{end}}"
initialDelaySeconds: {{ .Values.livenessProbe.initialDelaySeconds }}
periodSeconds: {{ .Values.livenessProbe.periodSeconds }}
timeoutSeconds: {{ .Values.livenessProbe.timeoutSeconds }}
successThreshold: {{ .Values.livenessProbe.successThreshold }}
failureThreshold: {{ .Values.livenessProbe.failureThreshold }}
{{- end }}
volumeMounts:
- name: storage-volume
mountPath: {{ .Values.persistentVolumes.data.mountPath | quote }}
8 changes: 8 additions & 0 deletions charts/timescaledb-single/values.yaml
@@ -362,6 +362,14 @@ readinessProbe:
failureThreshold: 6
successThreshold: 1

livenessProbe:
enabled: false
initialDelaySeconds: 15
periodSeconds: 10
timeoutSeconds: 10
failureThreshold: 1
successThreshold: 1
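Since `livenessProbe.enabled` defaults to `false`, the new probe is opt-in. For an existing release, something along these lines should enable it (the release name and chart path are placeholders, not part of this commit):

```shell
# Enable the new liveness probe for an existing release.
# "my-release" and the chart path are placeholders.
helm upgrade my-release ./charts/timescaledb-single \
  --reuse-values \
  --set livenessProbe.enabled=true \
  --set linkerd.adminShutdownOnExit=true
```

The second `--set` only matters when the pod runs with a linkerd-proxy sidecar; it is what makes the template pass `"1"` to `liveness_probe.sh` so the script performs the linkerd shutdown on failure.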

persistentVolumes:
# For sanity reasons, the actual PGDATA and wal directory will be subdirectories of the Volume mounts,
# this allows Patroni/a human/an automated operator to move directories during bootstrap, which cannot
