Kubernetes Terminating Issue
When using Kubernetes, it is all too easy to skip over troubleshooting because the application seems to keep working thanks to Kubernetes' excellent (or deceptive) fault tolerance. This is one of those issues we tend to just let go.
As shown in the kubectl get po output below, the pod status never changes from 1/1 to 0/1; the pod lingers in the list in Terminating state instead of disappearing.
$ kubectl get po
esevan-park-e9c9158b-8348-47d3-9b8b-a41df430b1c9 1/1 Terminating 0 2m15s
The pod status says "1 out of 1 container(s) inside the pod is still running, but I'm trying to terminate it!".
Technically speaking, the process specified as the container entrypoint is still running because it never recognized that it was asked to terminate.
I have experienced the following situations caused by this issue.
- When creating a replica of a stateful application for disaster recovery or a service update, the non-terminating process raises a split-brain issue.
- When updating a service, in-flight HTTP requests are dropped by an unexpected SIGKILL.
- Although I expected instant pod termination, it is delayed by the graceful termination period (default: 30 seconds).
- When a container holds resources that are not cleaned up by the Docker daemon (e.g., a mounted volume), the container is not deleted and remains in the Docker daemon.
In particular, some projects that lack Kubernetes-friendly features may impose a constraint on running the application: only one instance may run at a time. (I hope to cover what Kubernetes-friendly features are soon.)
If you use the "Recreate" deployment strategy to resolve the split-brain issue raised by that constraint, the application suffers more than 30 seconds of downtime until a new pod is up and running, because the endpoint is deleted from etcd as soon as the pod enters the Terminating state.
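For reference, the Recreate strategy is set on the Deployment spec. A minimal sketch (the deployment name, labels, and image here are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: single-instance-app        # hypothetical name
spec:
  replicas: 1
  strategy:
    type: Recreate                 # old pod is fully terminated before the new one starts
  selector:
    matchLabels:
      app: single-instance-app
  template:
    metadata:
      labels:
        app: single-instance-app
    spec:
      containers:
      - name: app
        image: example/app:latest  # hypothetical image
```

Unlike the default RollingUpdate strategy, Recreate guarantees the old and new pods never overlap, which is exactly what a single-instance constraint requires.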
The case is intermittent and doesn't always lead to bigger problems, but I want to share the root cause of this issue.
Kubernetes Pod Termination Process
In brief, the Kubernetes pod termination process works as follows.
1. Pod termination is requested (the grace period defaults to 30 seconds).
2. The pod status changes from Running to Terminating.
3. At the same time as step 2, the "preStop" hook is triggered and a SIGTERM signal is sent to PID 1 of the container.
4. At the same time as step 2, the pod's endpoint is removed from the service endpoints, which means requests to the service can no longer be forwarded to the process running inside the pod.
5. After the grace period, the container is forcefully killed with SIGKILL and the Kubernetes pod object is deleted.
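The SIGTERM-then-SIGKILL fallback in steps 3 and 5 can be sketched in plain Python. This is a toy illustration of the kubelet's behavior, not its actual code, and the grace period is shortened here:

```python
import subprocess
import sys
import time

# A toy child process that installs a SIGTERM handler and exits cleanly.
CHILD = r"""
import signal, sys, time
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))
while True:
    time.sleep(0.1)
"""

def terminate_with_grace(proc, grace=5.0):
    """Mimic the kubelet: send SIGTERM, wait up to `grace` seconds,
    then fall back to SIGKILL."""
    proc.terminate()                     # step 3: SIGTERM to the process
    try:
        return proc.wait(timeout=grace)  # give it the grace period
    except subprocess.TimeoutExpired:
        proc.kill()                      # step 5: SIGKILL after the grace period
        return proc.wait()

proc = subprocess.Popen([sys.executable, "-c", CHILD])
time.sleep(0.5)                          # let the child install its handler
rc = terminate_with_grace(proc)
print("exit code:", rc)                  # 0: the handler ran, clean shutdown
```

If the child had no SIGTERM handler (or never received the signal, as discussed below), it would burn through the whole grace period and die by SIGKILL instead.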
Let's see more detail of step 3.
SIGTERM Signal
If you haven't implemented a SIGTERM handler in the main process, the language-specific default SIGTERM handler will be invoked, which typically results in immediate process termination—similar to SIGKILL.
This may not cause an error per se, but it prevents graceful shutdown, meaning any queued requests will not be processed, and users may receive 5xx responses, effectively causing API downtime.
If you're using a service manager framework, most of them support handlers internally. If not, you can register graceful shutdown logic directly using the signal() system call.
Python Example:
import signal
import time

class GracefulKiller:
    kill_now = False

    def __init__(self):
        signal.signal(signal.SIGINT, self.exit_gracefully)
        signal.signal(signal.SIGTERM, self.exit_gracefully)

    def exit_gracefully(self, signum, frame):
        print(f'{signum} signal has been trapped. Cleaning my room before you kick me out.')
        time.sleep(1)
        self.kill_now = True

if __name__ == '__main__':
    killer = GracefulKiller()
    while True:
        time.sleep(1)
        print("doing something in a loop ...")
        if killer.kill_now:
            break
    print("End of the program. I was killed gracefully :)")
SIGTERM Sent to PID 1
If the process is started using the CMD instruction in the Dockerfile, the container runs /bin/sh as PID 1, and the program defined in CMD is forked as a child (i.e., not PID 1). Similarly, if a shell script is used at container startup, the shell becomes PID 1 and forks the main process.
In this setup, SIGTERM is sent only to /bin/sh or the defined shell, not to the main process. This prevents the main process from handling SIGTERM and invoking graceful shutdown logic, leading to a situation where resources aren't properly released after the grace period and the process is forcefully killed via SIGKILL. (In some cases, SIGKILL may also not be properly delivered to the main process.)
Bad Practice Example – Main Process PID is 57:
root@esevan-park-a5d58818-292b-48af-852f-7e133c9511d2:/home/esevan.park# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 16:33 ? 00:00:00 /bin/sh -c /usr/local/bin/bootstrap-kernel.sh
root 8 1 0 16:33 ? 00:00:00 /bin/bash /usr/local/bin/bootstrap-kernel.sh
root 17 1 0 16:33 ? 00:00:00 /usr/sbin/sshd
esevan.+ 57 8 12 16:33 ? 00:00:00 python /usr/local/bin/kernel-launchers/python/scripts/launch_ipykerne
esevan.+ 65 57 0 16:33 ? 00:00:00 python /usr/local/bin/kernel-launchers/python/scripts/launch_ipykerne
root 76 0 0 16:33 pts/0 00:00:00 bash
root 87 76 0 16:33 pts/0 00:00:00 ps -ef
5 Solutions
To address this, either modify the Dockerfile or the startup script:
1. Use the exec form of ENTRYPOINT (or CMD) in the Dockerfile, e.g. ENTRYPOINT ["/usr/local/bin/app"], so the target process runs as PID 1. The shell form wraps the command in /bin/sh -c, which becomes PID 1 instead.
2. In startup scripts, launch the main process with exec so it replaces the shell (and keeps PID 1) rather than being forked as a child:
# Bad
#!/bin/bash
Executable
# Good
#!/bin/bash
exec Executable
3. If user switching is needed in the script, use gosu instead of sudo or su, since gosu execs the target process rather than forking it.
4. Alternatively, trap signals within the script and forward them to the child so the actual application handles them:
#!/bin/bash
Executable &
pid=$!
trap "echo \"SIGTERM to ${pid}\"; kill -TERM ${pid}" TERM
trap "echo \"SIGHUP to ${pid}\"; kill -HUP ${pid}" HUP
trap "echo \"SIGINT to ${pid}\"; kill -INT ${pid}" INT
wait ${pid}
5. Use Kubernetes preStop hook to explicitly trigger shutdown logic before SIGTERM is sent to the container’s PID 1:
containers:
- name: user-container
  lifecycle:
    preStop:
      exec:
        command: ['/bin/terminate.sh']
According to Kubernetes documentation, HTTP-based preStop hooks are also supported, which may be useful in some cases.
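An HTTP-based hook might look like the sketch below; the /shutdown path and port here are hypothetical, and your application must actually serve such an endpoint:

```yaml
containers:
- name: user-container
  lifecycle:
    preStop:
      httpGet:
        path: /shutdown   # hypothetical endpoint implemented by the app
        port: 8080        # hypothetical port
```

This is handy when the shutdown logic already lives behind an admin endpoint, so no extra script needs to be baked into the image.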
Graceful shutdown allows the application to detect termination and respond accordingly, providing significant advantages in terms of reliability and user experience.
If you've encountered other issues or solutions related to graceful shutdown, please share.