admin 管理员组

文章数量: 887021


2023年12月23日发(作者:java编程都要什么软件)

Oracle 11G数据库DataGuard灾备切换方案

、检查

1、确定MRP进程在正常运行

备库执行如下SQL确定MRP进程正常:

SELECT PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';

主库执行如下SQL,确定备库是“REAL TIME八「「1丫”状态

SQL> SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;

RECOVERYMODE

MANAGED REAL TIME APPLY

如果备库没有启用real-time apply,则需要重新将备库启动至real-time apply:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

2、确定有足够的归档进程

在所有的主备库实例上查询参数LOG_ARCHIVE_MAX_PROCESSES,确定其值大于等于4, 但不会太大

3、确定目标备库的REDO为clear状态

虽然在发起SWITCHOVER TO PRIMARY命令时,备库的REDO会自动转换为CLEAR

状态,但依然建议在SWITCHOVER前REDO为CLEAR状态。

确保正确设置了 LOG_FILE_NAME_CONVERT参数。

使用如下SQL在目标备库上查看REDO状态:

SQL> SELECT DISTINCT # FROM V$LOG L, V$LOGFILE LF

WHERE # = #

AND NOT IN (UNUSED’,

、CLEARING’,’CLEARING_CURRENT’);

如果如上的查询有结果,则需要停止备库的REDOAPPLY,并通过如下的SQL来对其进 行

CLEAR

SQL> ALTER DATABASE CLEAR LOGFILE GROUP ;

4、确定没有大量的GAP

主库执行如下SQL查看主库当前的REDO SEQUENCE

SQL〉

SELECT THREAD#, SEQUENCE# FROM V$THREAD;

在备库上执行如下查询,确定查询出来的结果与上面的结果相比较只差1-2个数值

SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG

WHERE APPLIED = 'YES'

AND RESETLOGS_CHANGE# = (SELECT RESETLOGS_CHANGE#

FROM V$DATABASEINCARNATION WHERE STATUS = yCURRENT’)

GROUP BY THREAD#;

5、确定主库以及目标备库的所有文件都为ONLINE

主备库分别执行如下SQL,查看tempfile是否正常,如果备库上缺失文件则需要进行 处理:

SELECT FILENAME, BYTES, TABLESPACE

FROM V$TEMPFILE TMP, V$TABLESPACE TS WHERE #=#;

在主备库分别执行如下SQL,查看数据文件状态,结果应该一致

SELECT NAME FROM V$DATAFILE WHERE STATUS=’OFFLINE’;

如果备库上有比主库多出的OFFLINE状态的数据文件,则将其ONLINE:

ALTER DATABASE DATAFILE &FILEID ONLINE;

、切换

1、检查主库是否可切换至STANDBY

主库执行如下SQL执行检查

SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVERSTATUS

TO STANDBY

如上的SQL查询结果如果为〃TO STANDBY”或者〃SESSIONS ACTIVE〃表示主库可切

换至STANDBY,如果不为这两个值,则说明REDO传输存在问题。

2、停止主库第一个节点以外的所有实例(RAC)

最好使用shutdown normal或者shutdown immediate方式停止数据库。如果使用了

shutdown

abort将其他节点进行了关闭,则需等待RAC reconfig完成,且第一个节点 将其余REDO正常前滚或回滚

3、切换主库至STANDBY角色

将主库切换至STANDBY

ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;

如果遇到ORA-16139报错,且V$DATABASE视图中DATABASE_ROLE字段的值 已为“ PHYSICAL STANDBY”,则可继续(这种问题的出现其中一个可能是 数据库有大量的数据文件)。

4、确定STANDBY收至1」EOR

在主库的ALERT日志中可以看到类似如下的信息:

Switchover: Primary controlfile converted to standby controlfile

succesfully.

Tue Mar 15 16:12:15 2011

MRP0 started with pid=17, OS id=2717

MRP0: Background Managed Standby Recovery process started (SFO)

Serial Media Recovery started

Managed Standby Recovery not using Real Time Apply

Online logfile pre-clearing operation disabled by switchover Media

Recovery Log

/u01/app/flash_recovery_area/SFO/archivelog/2011_03_15/o1_mf_1_1

33_6qzl0yvd_.arc

Identified End-Of-Redo for thread 1 sequence 133

Resetting standby activation ID 0 (0x0)

Media Recovery End-Of-Redo indicator encountered

Media Recovery Applied until change 4314801

MRP0: Media Recovery Complete: End-Of-REDO (SFO)

MRP0: Background Media Recovery process shutdown (SFO)

Tue Mar 15 16:12:21 2011

Switchover: Complete - Database shutdown required (SFO)

Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDB、

WITH SESSION SHUTDOWN

同时在所有备库的ALERT日志中可以看到类似如下的信息:

Tue Mar 15 16:12:15 2011

RFS[8]: Assigned to RFS process 2715

RFS[8]: Identified database type as ,physical standby,: Client is

Foreground pid 2568

Media Recovery Log

/u01/app/flash_recovery_area/NYC/archivelog/2011_03_15/o1_mf_1_1

33_6qzl0yjp_.arc

Identified End-Of-Redo for thread 1 sequence 133

Resetting standby activation ID 2680651518 (0x9fc77efe)

Media Recovery End-Of-Redo indicator encountered

Media Recovery Continuing

Resetting standby activation ID 2680651518 (0x9fc77efe)

Media Recovery Waiting for thread 1 sequence 134

5、检查STANDBY能够切换至PRIMARY

目标备库上执行如下SQL进行检查

SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVERSTATUS

TO PRIMARY

如上的SQL查询结果如果为"

PRIMARY”或者"SESSIONS ACTIVE"表示目标备库可 切换至TO

PRIMARY,如果不为这两个值,则说明REDO传输或者应用存在问题。

6、切换备库至PRIMARY

在目标备库执行如下命令

ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;

同时在alert日志中有类似如下信息

Tue Mar 15 16:16:44 2011

ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN

ALTER DATABASE SWITCHOVER TO PRIMARY (NYC)

Maximum wait for role transition is 15 minutes.

Switchover: Media recovery is still active

Role Change: Canceling MRP - no more redo to apply

Tue Mar 15 16:16:45 2011

MRP0: Background Media Recovery cancelled with status 16037 Errors

in file

/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_:

ORA-16037: user requested cancel of managed recovery operation

Managed Standby Recovery not using Real Time Apply

Recovery interrupted!

Waiting for MRP0 pid 2460 to terminate

Errors in file

/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_:

ORA-16037: user requested cancel of managed recovery operation Tue

Mar 15 16:16:45 2011

MRP0: Background Media Recovery process shutdown (NYC)

Role Change: Canceled MRP

7、打开新的主库

在新的主库上打开数据库

ALTER DATABASE OPEN;

检查新主库的TEMPFILE

8、

如果存在问题则进行处理。

9、

重启新的备库

首先停止新的备库

SHUTDOWN ABORT;

注:如果使用immediate停止数据库,则其依然会使用abort方式停止数据库,会在

alert日志中看到类似如下信息:

Performing implicit shutdown abort due to switchover to physical

standby

Shutting down instance (abort)

License high water mark = 15

USER (ospid: 14665): terminating the instance

Instance terminated by USER, pid = 14665

启动新的备库:

SQL> STARTUP MOUNT;

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

10、 意外或回退

参考

Appendix A.4.5 Roll Back After Unsuccessful Switchover and

Start Over

三、无法正常切换的处理 若主数据库异常中断无法连接做switchover处理,需要将灾备环境强制切换为主库(即 failover),需要注意的是,此种切换是将备库强制进行切换,可能会由于主备库之间并未完 全同步导致有数据丢失,需慎重处理。

1、检查备库是否可正常切换至PRIMARY

备库执行如下SQL执行检查

SQL> select database_role,switchover_status from v$database;

DATABASEROLE SWITCHOVERSTATUS

PHYSICAL STANDBY NOT ALLOWED

如上的SQL查询结果如果为"T。PRIMARY”或者"SESSIONS ACTIVE"表示目标备库可 正常切换至PRIMARY,如果不为这两个值,则说明REDO传输或者应用存在问题,则 需要执行强制切换。

2、关闭备库的MRP进程 在目标备库执行如下命令

ALTER DATABASE RECOVER MANAGED STADNBY DATABASE FINISH;

同时在alert日志中有类似如下信息

ALTER DATABASE RECOVER managed standby database finish

Terminal Recovery: request posted (DMPDB) Wed Mar 04 21:34:34 2015

Begin: Standby Redo Logfile archival

End: Standby Redo Logfile archival

Terminal Recovery timestamp is 503/04/2015 21:34:34'

Terminal Recovery: applying standby redo logs.

Terminal Recovery: thread 1 seq# 34 redo required Media Recovery

Waiting for thread 1 sequence 34 Terminal Recovery: End-Of-Redo

log allocation Terminal Recovery: standby redo logfile 4 created

'/archivelog/dmpdb/arch_1_0_'

This standby redo logfile is being created as part of the failover

operation. This standby redo logfile should be deleted after the

switchover to primary operation completes. Media Recovery Log

/archivelog/dmpdb/arch_1_0_ Terminal Recovery: log 4

reserved for thread 1 sequence 34 Recovery of Online Redo Log:

Thread 1 Group 4 Seq 34 Reading mem 0 Mem# 0:

/archivelog/dmpdb/arch_1_0_

Identified End-Of-Redo (failover) for thread 1 sequence 34 at SCN

ff

Incomplete Recovery applied until change 1234252 time 03/04/2015

21:23:43

MRP0: Media Recovery Complete (DMPDB)

Terminal Recovery: successful completion

Wed Mar 04 21:34:35 2015

ARCH: Archival stopped, error occurred. Will continue retrying

ORACLE Instance DMPDB - Archival Error

ORA-16014: log 4 sequence# 34 not archived, no available

destinations

ORA-00312: online log 4 thread 1:

'/archivelog/dmpdb/arch_1_0_' Forcing ARSCN to IRSCN

for TR 0:1234252 Attempt to set limbo arscn 0:1234252 irscn

0:1234252 Resetting standby activation ID 2865247982 (0xaac836ee)

MRP0: Background Media Recovery process shutdown (DMPDB) Terminal

Recovery: completion detected (DMPDB)

Completed: ALTER DATABASE RECOVER MANAGED STADNBY DATABASE FINISH

3、切换数据库到Primary

执行如下SQL检查备库的状态

SQL> select database_role,switchover_status from v$database;

DATABASE_ROLE SWITCHOVER_STATUS ----------------

PHYSICAL STANDBY TO PRIMARY

如果结果为“TO PRIMARY”则说明已经可以切换到主库,继续如下操作。

将备库切换为主库

SQL> alter database commit to switchover to primary; Database altered.

打开新的主库

SQL> alter database open;

Database altered.

alert日志中看到类似如下信息:

alter database commit to switchover to

primary ALTER DATABASE SWITCHOVER TO PRIMARY (DMPDB) Maximum wait

for role transition is 15 minutes. All dispatchers and shared

servers shutdown CLOSE: killing server sessions.

CLOSE: all sessions shutdown successfully.

Wed Mar 04 21:35:47 2015

SMON: disabling cache recovery

Backup controlfile written to trace file

/u01/app/oracle/diag/rdbms/DMPDB/DMPDB/trace/DMPDB_ora_

Standby terminal recovery start SCN: 1234251

RESETLOGS after incomplete recovery UNTIL CHANGE 1234252

Online log /oradata/dmpdb/: Thread 1 Group 1 was

previously cleared

Online log /oradata/dmpdb/: Thread 1 Group 2 was

previously cleared

Online log /oradata/dmpdb/: Thread 1 Group 3 was

previously cleared

Standby became primary SCN: 1234250

Wed Mar 04 21:35:47 2015

Setting recovery target incarnation to 3

AUDIT_TRAIL initialization parameter is changed back to its

original value as specified in the parameter file.

Switchover: Complete - Database mounted as primary

Completed: alter database commit to switchover to primary

4、检查新主库的TEMPFILE

如果存在问题则进行处理。

5、待原主库环境修复后重做DataGuard

由于做了

failover的强制切换,破坏了原有的DG环境,需要将原主库环境进行重做,

即,将新的主库同步到原主库环境进行DataGuard的重做。


本文标签: 备库 切换 执行 主库 确定