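I recently optimized a program and made use of Go's graceful shutdown mechanism.
The program uses etcd's election SDK for high-availability leader election. When a node goes offline unexpectedly, it has to resign proactively in etcd (delete its 10s lease); otherwise etcd still regards the offline node as the leader.
So in this case graceful shutdown is a hard technical requirement.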
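In addition, factor 9 of the cloud-native Twelve-Factor methodology (maximize robustness with fast startup and graceful shutdown) recommends the same practice.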
Fast startup and shutdown are advocated for a more robust and resilient system.
The straightforward approach: catch the program's termination signal and resign proactively.
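Standard signals [1]: Linux supports the standard signals listed below. The second column indicates which standard the signal conforms to.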
Signal      Standard   Action   Comment
────────────────────────────────────────────────────────────────────────
SIGABRT      P1990      Core    Abort signal from abort(3)
SIGALRM      P1990      Term    Timer signal from alarm(2)
SIGBUS       P2001      Core    Bus error (bad memory access)
SIGCHLD      P1990      Ign     Child stopped or terminated
SIGCLD         -        Ign     A synonym for SIGCHLD
SIGCONT      P1990      Cont    Continue if stopped
SIGEMT         -        Term    Emulator trap
SIGFPE       P1990      Core    Floating-point exception
SIGHUP       P1990      Term    Hangup detected on controlling terminal
                                or death of controlling process
SIGILL       P1990      Core    Illegal Instruction
SIGINFO        -                A synonym for SIGPWR
SIGINT       P1990      Term    Interrupt from keyboard
SIGIO          -        Term    I/O now possible (4.2BSD)
SIGIOT         -        Core    IOT trap. A synonym for SIGABRT
SIGKILL      P1990      Term    Kill signal
SIGLOST        -        Term    File lock lost (unused)
SIGPIPE      P1990      Term    Broken pipe: write to pipe with no
                                readers; see pipe(7)
SIGPOLL      P2001      Term    Pollable event (Sys V); synonym for SIGIO
SIGPROF      P2001      Term    Profiling timer expired
SIGPWR         -        Term    Power failure (System V)
SIGQUIT      P1990      Core    Quit from keyboard
SIGSEGV      P1990      Core    Invalid memory reference
SIGSTKFLT      -        Term    Stack fault on coprocessor (unused)
SIGSTOP      P1990      Stop    Stop process
SIGTSTP      P1990      Stop    Stop typed at terminal
SIGSYS       P2001      Core    Bad system call (SVr4); see also seccomp(2)
SIGTERM      P1990      Term    Termination signal
SIGTRAP      P2001      Core    Trace/breakpoint trap
SIGTTIN      P1990      Stop    Terminal input for background process
SIGTTOU      P1990      Stop    Terminal output for background process
SIGUNUSED      -        Core    Synonymous with SIGSYS
SIGURG       P2001      Ign     Urgent condition on socket (4.2BSD)
SIGUSR1      P1990      Term    User-defined signal 1
SIGUSR2      P1990      Term    User-defined signal 2
SIGVTALRM    P2001      Term    Virtual alarm clock (4.2BSD)
SIGXCPU      P2001      Core    CPU time limit exceeded (4.2BSD);
                                see setrlimit(2)
SIGXFSZ      P2001      Core    File size limit exceeded (4.2BSD);
                                see setrlimit(2)
SIGWINCH       -        Ign     Window resize signal (4.3BSD, Sun)
Of these, the SIGKILL and SIGSTOP signals cannot be caught, blocked, or ignored.
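There are three common ways to terminate a program:
1. CTRL+C sends the SIGINT signal.
2. kill pid sends the SIGTERM signal to the specified process (this is the signal kill sends by default). If the application has no logic to catch and respond to it, the default action is to terminate the process. This is the recommended way to stop a process.
3. kill -9 pid sends the SIGKILL signal to the specified process. SIGKILL can neither be caught by the application nor blocked or ignored.
So to achieve our goal, catching the SIGINT and SIGTERM signals is enough.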
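To check the claim that SIGTERM and SIGINT can be caught, here is a minimal sketch that sends a signal to its own process and waits for it to come back through Go's os/signal package. The helper name catchOwnSignal and the one-second wait are assumptions made for this illustration. It relies on syscall.Kill, so it targets Linux and other Unix-like systems. SIGKILL is deliberately left out, since sending it would simply end the process.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// catchOwnSignal registers for sig, sends sig to the current process
// (the same thing `kill pid` does from outside), and reports whether
// the signal was delivered to our channel within one second.
func catchOwnSignal(sig syscall.Signal) bool {
	c := make(chan os.Signal, 1) // buffered: signal.Notify never blocks on send
	signal.Notify(c, sig)
	defer signal.Stop(c)

	if err := syscall.Kill(syscall.Getpid(), sig); err != nil {
		return false
	}
	select {
	case got := <-c:
		return got == sig
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("SIGTERM caught:", catchOwnSignal(syscall.SIGTERM))
	fmt.Println("SIGINT caught:", catchOwnSignal(syscall.SIGINT))
}
```

Because the handler is registered before the signal is sent, the default "terminate" action never fires and the process keeps running.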
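Go provides the signal package (os/signal) to listen for incoming signals and report them.
For a long-running program, you can start a new goroutine that keeps listening for signals and runs the graceful shutdown code when one arrives.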
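// Requires the imports "os", "os/signal" and "syscall".
// The channel must be buffered: signal.Notify does not block when sending,
// so a signal could be dropped on an unbuffered channel.
c := make(chan os.Signal, 1)
signal.Notify(c, syscall.SIGTERM, syscall.SIGINT)
go func() {
	sig := <-c // wait for the first SIGTERM or SIGINT
	log.Infof("Got %s signal. Aborting...", sig)
	eCli.Close() // proactively resign through the etcd election SDK
	os.Exit(1)
}()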
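As an alternative, Go 1.16 and later offer signal.NotifyContext, which turns the same signals into context cancellation. The sketch below is one possible shape, not the program's actual code: waitForShutdown and resign are hypothetical names, and resign only stands in for the eCli.Close() call.

```go
package main

import (
	"context"
	"fmt"
	"os/signal"
	"syscall"
)

// waitForShutdown blocks until done is closed, then runs cleanup
// and returns whatever error cleanup reports.
func waitForShutdown(done <-chan struct{}, cleanup func() error) error {
	<-done
	return cleanup()
}

// resign is a hypothetical stand-in for eCli.Close().
func resign() error {
	fmt.Println("resigning leadership")
	return nil
}

func main() {
	// ctx is cancelled as soon as SIGTERM or SIGINT arrives.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	if err := waitForShutdown(ctx.Done(), resign); err != nil {
		fmt.Println("cleanup failed:", err)
	}
}
```

The context can also be handed to other goroutines, so every worker learns about the shutdown from one place.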
Does this still work inside a container?
We need to look at how Docker officially defines the docker stop and docker kill commands.
docker stop [2]: The main process inside the container will receive SIGTERM, and after a grace period, SIGKILL. (default grace period = 10s)
docker kill [3]: The main process inside the container is sent SIGKILL signal (default), or the signal that is specified with the --signal option.
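The docker stop command we normally use sends SIGTERM to the main process in the container and then SIGKILL 10s later. Those 10s give the program time to shut down gracefully, so the code above does work in containers.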
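Since SIGKILL follows after the grace period, the cleanup itself should not be allowed to hang. The sketch below bounds the cleanup with a timeout. The helper cleanupWithin and the 8s budget are assumptions for illustration (chosen to stay under the 10s default), and the printed message stands in for the real resignation call.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// cleanupWithin runs cleanup in a goroutine and reports whether it
// finished before limit elapsed.
func cleanupWithin(limit time.Duration, cleanup func()) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		cleanup()
	}()
	select {
	case <-done:
		return true
	case <-time.After(limit):
		return false
	}
}

func main() {
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGTERM, syscall.SIGINT)
	<-c // block until docker stop (SIGTERM) or CTRL+C (SIGINT)

	// Assumed budget: 8s, below docker stop's default 10s grace period.
	ok := cleanupWithin(8*time.Second, func() {
		fmt.Println("resigning leadership") // stand-in for eCli.Close()
	})
	if !ok {
		fmt.Println("cleanup timed out; exiting anyway")
		os.Exit(1)
	}
}
```

If the cleanup times out, the lease is not deleted explicitly, so the node stops being leader only once the 10s lease expires on its own.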