标签
PostgreSQL , 重启 , 信号 , postmaster.pid , pg_reload_conf()
背景
如何通过SQL接口直接关闭数据库,或者重启数据库?
关闭和重启数据库是一个风险较大的操作,如果能通过SQL来关闭,重启数据库,当然是很难想象的,因为SQL通常是使用接口,而不是管理接口。当然并不是数据库做不到通过SQL管理数据库,而是这确实是风险较大且并不是数据库核心的能努力。
但是为了方便管理,数据库还是提供了很多管理函数(通过SQL调用)。例如:
https://www.postgresql.org/docs/11/functions-info.html
那么能不能通过SQL接口来关闭,或者重启数据库呢?(通常我们需要登陆到数据库所在的操作系统,执行pg_ctl来实现)
关闭数据库的底层实现
实际上关闭数据库是往postgres master进程(数据库启动时的父进程)发送信号,进程在收到信号后会进行相应的操作。可以通过看postmaster.c代码或通过man postgres得到这个信息:
man postgres
To terminate the postgres server normally, the signals SIGTERM, SIGINT, or SIGQUIT can be used.
The first will wait for all clients to terminate before quitting,
the second will forcefully disconnect all clients,
and the third will quit immediately without proper shutdown, resulting in a recovery run during restart.
如何获得postmaster进程pid呢?
直接读postmaster.pid文件即可得到:
postgres=# select * from pg_read_file('postmaster.pid');
pg_read_file
----------------------------
30503 +
/data01/digoal/pg_root8001+
1549031862 +
8001 +
. +
0.0.0.0 +
8001001 39288833 +
ready +
(1 row)
30503 为postmaster进程的PID。
关闭数据库就是往这个PID发送信号(SIGTERM 正常关闭, SIGINT 快速关闭, or SIGQUIT 暴力关闭)。
发送信号给数据库进程
src/backend/utils/adt/misc.c
1、发送给postmaster进程SIGHUP信号,用于reload conf。
/*
* Signal to reload the database configuration
*
* Permission checking for this function is managed through the normal
* GRANT system.
*/
Datum
pg_reload_conf(PG_FUNCTION_ARGS)
{
if (kill(PostmasterPid, SIGHUP))
{
ereport(WARNING,
(errmsg("failed to send signal to postmaster: %m")));
PG_RETURN_BOOL(false);
}
PG_RETURN_BOOL(true);
}
2、发送给普通进程,用于cancel query或terminate session
/*
* Signal to terminate a backend process. This is allowed if you are a member
* of the role whose process is being terminated.
*
* Note that only superusers can signal superuser-owned processes.
*/
Datum
pg_terminate_backend(PG_FUNCTION_ARGS)
{
int r = pg_signal_backend(PG_GETARG_INT32(0), SIGTERM);
if (r == SIGNAL_BACKEND_NOSUPERUSER)
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
(errmsg("must be a superuser to terminate superuser process"))));
if (r == SIGNAL_BACKEND_NOPERMISSION)
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
(errmsg("must be a member of the role whose process is being terminated or member of pg_signal_backend"))));
PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS);
}
/*
* Signal to cancel a backend process. This is allowed if you are a member of
* the role whose process is being canceled.
*
* Note that only superusers can signal superuser-owned processes.
*/
Datum
pg_cancel_backend(PG_FUNCTION_ARGS)
{
int r = pg_signal_backend(PG_GETARG_INT32(0), SIGINT);
if (r == SIGNAL_BACKEND_NOSUPERUSER)
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
(errmsg("must be a superuser to cancel superuser query"))));
if (r == SIGNAL_BACKEND_NOPERMISSION)
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
(errmsg("must be a member of the role whose query is being canceled or member of pg_signal_backend"))));
PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS);
}
src/backend/utils/adt/misc.c
/*
* Send a signal to another backend.
*
* The signal is delivered if the user is either a superuser or the same
* role as the backend being signaled. For "dangerous" signals, an explicit
* check for superuser needs to be done prior to calling this function.
*
* Returns 0 on success, 1 on general failure, 2 on normal permission error
* and 3 if the caller needs to be a superuser.
*
* In the event of a general failure (return code 1), a warning message will
* be emitted. For permission errors, doing that is the responsibility of
* the caller.
*/
#define SIGNAL_BACKEND_SUCCESS 0
#define SIGNAL_BACKEND_ERROR 1
#define SIGNAL_BACKEND_NOPERMISSION 2
#define SIGNAL_BACKEND_NOSUPERUSER 3
static int
pg_signal_backend(int pid, int sig)
{
。。。
if (proc == NULL)
{
/*
* This is just a warning so a loop-through-resultset will not abort
* if one backend terminated on its own during the run.
*/
ereport(WARNING,
(errmsg("PID %d is not a PostgreSQL server process", pid)));
return SIGNAL_BACKEND_ERROR;
}
。。。
PG内部并没有开放一个SQL接口来停库,所以我们需要自己写一个
vi pg_fast_stop.c
#include <signal.h>
#include "fmgr.h"
#include "postgres.h"
PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(pg_fast_stop);
Datum
pg_fast_stop(PG_FUNCTION_ARGS)
{
if (kill(PostmasterPid, SIGINT))
{
ereport(WARNING,
(errmsg("failed to send signal to postmaster: %m")));
PG_RETURN_BOOL(false);
}
PG_RETURN_BOOL(true);
}
gcc -O3 -Wall -Wextra -I /home/digoal/postgresql-11.1/src/include -g -fPIC -c ./pg_fast_stop.c -o pg_fast_stop.o
gcc -O3 -Wall -Wextra -I /home/digoal/postgresql-11.1/src/include -g -shared pg_fast_stop.o -o libpg_fast_stop.so
cp libpg_fast_stop.so $PGHOME/lib/
psql
create or replace function pg_fast_stop() returns int as '$libdir/libpg_fast_stop.so', 'pg_fast_stop' language C STRICT;
试用:
postgres=# select pg_fast_stop();
pg_fast_stop
------------
1
(1 row)
数据库已关机
postgres=# \dt
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q
如何实现SQL接口重启数据库呢?
因为往POSTMASTER PID发送信号只能关闭数据库,无法重启数据库。那么怎么实现重启呢?
1、#restart_after_crash = on # reinitialize after backend crash?
利用普通用户进程被KILL -9来自动重启,这个是postmaster守护进程自动执行的重启动作。
2、利用plsh存储过程语言,直接调用pg数据库操作系统的pg_ctl命令来重启。
https://github.com/petere/plsh
参考
https://github.com/petere/plsh
https://www.postgresql.org/docs/11/functions-info.html
https://www.postgresql.org/docs/11/functions-admin.html
src/backend/utils/adt/misc.c