pgsql promote timeout under Linux HA - Q&A - Alibaba Cloud Developer Community

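I have a primary/standby PostgreSQL pair deployed under SUSE Linux HA.
One day the monitor operation on the primary timed out, which caused the cluster to shut down the primary and promote the standby.
However, the promote of the standby timed out as well, and the standby's pg_log was deleted later during recovery.
Is there any way to find out why the promote timed out, and why the monitor on the primary timed out?
Where should I start looking? I cannot find the actual cause in the corosync log; it only shows which operations were triggered.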

PS: the primary's pg_log from the time of the monitor timeout shows that a backup was being taken:
01:28:06 [unknown] postgres NOTICE: pg_stop_backup cleanup done, waiting for required WAL segments to be archived
01:28:23 [unknown] postgres NOTICE: pg_stop_backup complete, all required WAL segments have been archived
01:28:25 LOG: received fast shutdown request

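Promote itself has no notion of a timeout, so I suggest going through the corosync flow again, including whether this backup is one of the steps in corosync's switchover procedure.
One more piece of information: promote comes in two variants, one that has to perform a checkpoint and one that does not.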

                        if (fast_promote)
                        {
                                checkPointLoc = ControlFile->prevCheckPoint;

                                /*
                                 * Confirm the last checkpoint is available for us to recover
                                 * from if we fail. Note that we don't check for the secondary
                                 * checkpoint since that isn't available in most base backups.
                                 */
                                record = ReadCheckpointRecord(xlogreader, checkPointLoc, 1, false);
                                if (record != NULL)
                                {
                                        fast_promoted = true;

                                        /*
                                         * Insert a special WAL record to mark the end of
                                         * recovery, since we aren't doing a checkpoint. That
                                         * means that the checkpointer process may likely be in
                                         * the middle of a time-smoothed restartpoint and could
                                         * continue to be for minutes after this. That sounds
                                         * strange, but the effect is roughly the same and it
                                         * would be stranger to try to come out of the
                                         * restartpoint and then checkpoint. We request a
                                         * checkpoint later anyway, just for safety.
                                         */
                                        CreateEndOfRecoveryRecord();
                                }
                        }

                        if (!fast_promoted)
                                RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
                                                                  CHECKPOINT_IMMEDIATE |
                                                                  CHECKPOINT_WAIT);
                }

If this is what made corosync judge the operation as timed out, I suggest using fast promote.
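
To make the two paths in the excerpt easier to reason about, here is a minimal, self-contained sketch of the same decision. It is only a model: `promote_path`, `choose_promote_path` and `promote_fits_timeout` are hypothetical names invented for this illustration and are not part of PostgreSQL or of any resource agent, and the timing arguments are estimates an operator would have to supply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two promote paths; not PostgreSQL's API. */
typedef enum {
    PROMOTE_END_OF_RECOVERY_RECORD, /* fast path: write one WAL record, checkpoint later */
    PROMOTE_WAIT_FOR_CHECKPOINT     /* slow path: block until an immediate checkpoint ends */
} promote_path;

/*
 * Mirrors the branch in the excerpt: the fast path is taken only when fast
 * promotion was requested AND the last checkpoint record could be read back.
 * In every other case the code falls through to the waiting checkpoint.
 */
promote_path choose_promote_path(bool fast_promote, bool last_checkpoint_readable)
{
    if (fast_promote && last_checkpoint_readable)
        return PROMOTE_END_OF_RECOVERY_RECORD;
    return PROMOTE_WAIT_FOR_CHECKPOINT;
}

/*
 * Rough check of whether a promote would finish inside the cluster manager's
 * operation timeout. base_secs and checkpoint_secs are operator estimates
 * (assumptions), not values that PostgreSQL reports.
 */
bool promote_fits_timeout(promote_path path, int base_secs,
                          int checkpoint_secs, int timeout_secs)
{
    assert(base_secs >= 0 && checkpoint_secs >= 0 && timeout_secs >= 0);

    int total = base_secs;
    if (path == PROMOTE_WAIT_FOR_CHECKPOINT)
        total += checkpoint_secs; /* only the slow path waits for the checkpoint */
    return total < timeout_secs;
}
```

For example, with made-up estimates of 5 seconds of base work, a 90-second checkpoint and a 60-second operation timeout, `promote_fits_timeout(choose_promote_path(false, true), 5, 90, 60)` is false while `promote_fits_timeout(choose_promote_path(true, true), 5, 90, 60)` is true. That is the situation the answer describes: the promote is not stuck, but the waiting checkpoint can outlast the timeout configured in the cluster stack.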
