admin 管理员组文章数量: 887021
2023年12月23日发(作者:java编程都要什么软件)
Oracle 11G数据库DataGuard灾备切换方案
、检查
1、确定MRP进程在正常运行
备库执行如下SQL确定MRP进程正常:
SELECT PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
主库执行如下SQL,确定备库是“REAL TIME八「「1丫”状态
SQL> SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;
RECOVERYMODE
MANAGED REAL TIME APPLY
如果备库没有启用real-time apply,则需要重新将备库启动至real-time apply:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
2、确定有足够的归档进程
在所有的主备库实例上查询参数LOG_ARCHIVE_MAX_PROCESSES,确定其值大于等于4, 但不会太大
3、确定目标备库的REDO为clear状态
虽然在发起SWITCHOVER TO PRIMARY命令时,备库的REDO会自动转换为CLEAR
状态,但依然建议在SWITCHOVER前REDO为CLEAR状态。
确保正确设置了 LOG_FILE_NAME_CONVERT参数。
使用如下SQL在目标备库上查看REDO状态:
SQL> SELECT DISTINCT # FROM V$LOG L, V$LOGFILE LF
WHERE # = #
AND NOT IN (UNUSED’,
、CLEARING’,’CLEARING_CURRENT’);
如果如上的查询有结果,则需要停止备库的REDOAPPLY,并通过如下的SQL来对其进 行
CLEAR
SQL> ALTER DATABASE CLEAR LOGFILE GROUP
4、确定没有大量的GAP
主库执行如下SQL查看主库当前的REDO SEQUENCE
SQL〉
SELECT THREAD#, SEQUENCE# FROM V$THREAD;
在备库上执行如下查询,确定查询出来的结果与上面的结果相比较只差1-2个数值
SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG
WHERE APPLIED = 'YES'
AND RESETLOGS_CHANGE# = (SELECT RESETLOGS_CHANGE#
FROM V$DATABASEINCARNATION WHERE STATUS = yCURRENT’)
GROUP BY THREAD#;
5、确定主库以及目标备库的所有文件都为ONLINE
主备库分别执行如下SQL,查看tempfile是否正常,如果备库上缺失文件则需要进行 处理:
SELECT FILENAME, BYTES, TABLESPACE
FROM V$TEMPFILE TMP, V$TABLESPACE TS WHERE #=#;
在主备库分别执行如下SQL,查看数据文件状态,结果应该一致
SELECT NAME FROM V$DATAFILE WHERE STATUS=’OFFLINE’;
如果备库上有比主库多出的OFFLINE状态的数据文件,则将其ONLINE:
ALTER DATABASE DATAFILE &FILEID ONLINE;
、切换
1、检查主库是否可切换至STANDBY
主库执行如下SQL执行检查
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVERSTATUS
TO STANDBY
如上的SQL查询结果如果为〃TO STANDBY”或者〃SESSIONS ACTIVE〃表示主库可切
换至STANDBY,如果不为这两个值,则说明REDO传输存在问题。
2、停止主库第一个节点以外的所有实例(RAC)
最好使用shutdown normal或者shutdown immediate方式停止数据库。如果使用了
shutdown
abort将其他节点进行了关闭,则需等待RAC reconfig完成,且第一个节点 将其余REDO正常前滚或回滚
3、切换主库至STANDBY角色
将主库切换至STANDBY
ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
如果遇到ORA-16139报错,且V$DATABASE视图中DATABASE_ROLE字段的值 已为“ PHYSICAL STANDBY”,则可继续(这种问题的出现其中一个可能是 数据库有大量的数据文件)。
4、确定STANDBY收至1」EOR
在主库的ALERT日志中可以看到类似如下的信息:
Switchover: Primary controlfile converted to standby controlfile
succesfully.
Tue Mar 15 16:12:15 2011
MRP0 started with pid=17, OS id=2717
MRP0: Background Managed Standby Recovery process started (SFO)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Online logfile pre-clearing operation disabled by switchover Media
Recovery Log
/u01/app/flash_recovery_area/SFO/archivelog/2011_03_15/o1_mf_1_1
33_6qzl0yvd_.arc
Identified End-Of-Redo for thread 1 sequence 133
Resetting standby activation ID 0 (0x0)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Applied until change 4314801
MRP0: Media Recovery Complete: End-Of-REDO (SFO)
MRP0: Background Media Recovery process shutdown (SFO)
Tue Mar 15 16:12:21 2011
Switchover: Complete - Database shutdown required (SFO)
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDB、
WITH SESSION SHUTDOWN
同时在所有备库的ALERT日志中可以看到类似如下的信息:
Tue Mar 15 16:12:15 2011
RFS[8]: Assigned to RFS process 2715
RFS[8]: Identified database type as ,physical standby,: Client is
Foreground pid 2568
Media Recovery Log
/u01/app/flash_recovery_area/NYC/archivelog/2011_03_15/o1_mf_1_1
33_6qzl0yjp_.arc
Identified End-Of-Redo for thread 1 sequence 133
Resetting standby activation ID 2680651518 (0x9fc77efe)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Resetting standby activation ID 2680651518 (0x9fc77efe)
Media Recovery Waiting for thread 1 sequence 134
5、检查STANDBY能够切换至PRIMARY
目标备库上执行如下SQL进行检查
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVERSTATUS
TO PRIMARY
如上的SQL查询结果如果为"
PRIMARY”或者"SESSIONS ACTIVE"表示目标备库可 切换至TO
PRIMARY,如果不为这两个值,则说明REDO传输或者应用存在问题。
6、切换备库至PRIMARY
在目标备库执行如下命令
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
同时在alert日志中有类似如下信息
Tue Mar 15 16:16:44 2011
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN
ALTER DATABASE SWITCHOVER TO PRIMARY (NYC)
Maximum wait for role transition is 15 minutes.
Switchover: Media recovery is still active
Role Change: Canceling MRP - no more redo to apply
Tue Mar 15 16:16:45 2011
MRP0: Background Media Recovery cancelled with status 16037 Errors
in file
/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Waiting for MRP0 pid 2460 to terminate
Errors in file
/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_:
ORA-16037: user requested cancel of managed recovery operation Tue
Mar 15 16:16:45 2011
MRP0: Background Media Recovery process shutdown (NYC)
Role Change: Canceled MRP
7、打开新的主库
在新的主库上打开数据库
ALTER DATABASE OPEN;
检查新主库的TEMPFILE
8、
如果存在问题则进行处理。
9、
重启新的备库
首先停止新的备库
SHUTDOWN ABORT;
注:如果使用immediate停止数据库,则其依然会使用abort方式停止数据库,会在
alert日志中看到类似如下信息:
Performing implicit shutdown abort due to switchover to physical
standby
Shutting down instance (abort)
License high water mark = 15
USER (ospid: 14665): terminating the instance
Instance terminated by USER, pid = 14665
启动新的备库:
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
10、 意外或回退
参考
Appendix A.4.5 Roll Back After Unsuccessful Switchover and
Start Over
三、无法正常切换的处理 若主数据库异常中断无法连接做switchover处理,需要将灾备环境强制切换为主库(即 failover),需要注意的是,此种切换是将备库强制进行切换,可能会由于主备库之间并未完 全同步导致有数据丢失,需慎重处理。
1、检查备库是否可正常切换至PRIMARY
备库执行如下SQL执行检查
SQL> select database_role,switchover_status from v$database;
DATABASEROLE SWITCHOVERSTATUS
PHYSICAL STANDBY NOT ALLOWED
如上的SQL查询结果如果为"T。PRIMARY”或者"SESSIONS ACTIVE"表示目标备库可 正常切换至PRIMARY,如果不为这两个值,则说明REDO传输或者应用存在问题,则 需要执行强制切换。
2、关闭备库的MRP进程 在目标备库执行如下命令
ALTER DATABASE RECOVER MANAGED STADNBY DATABASE FINISH;
同时在alert日志中有类似如下信息
ALTER DATABASE RECOVER managed standby database finish
Terminal Recovery: request posted (DMPDB) Wed Mar 04 21:34:34 2015
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Terminal Recovery timestamp is 503/04/2015 21:34:34'
Terminal Recovery: applying standby redo logs.
Terminal Recovery: thread 1 seq# 34 redo required Media Recovery
Waiting for thread 1 sequence 34 Terminal Recovery: End-Of-Redo
log allocation Terminal Recovery: standby redo logfile 4 created
'/archivelog/dmpdb/arch_1_0_'
This standby redo logfile is being created as part of the failover
operation. This standby redo logfile should be deleted after the
switchover to primary operation completes. Media Recovery Log
/archivelog/dmpdb/arch_1_0_ Terminal Recovery: log 4
reserved for thread 1 sequence 34 Recovery of Online Redo Log:
Thread 1 Group 4 Seq 34 Reading mem 0 Mem# 0:
/archivelog/dmpdb/arch_1_0_
Identified End-Of-Redo (failover) for thread 1 sequence 34 at SCN
ff
Incomplete Recovery applied until change 1234252 time 03/04/2015
21:23:43
MRP0: Media Recovery Complete (DMPDB)
Terminal Recovery: successful completion
Wed Mar 04 21:34:35 2015
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance DMPDB - Archival Error
ORA-16014: log 4 sequence# 34 not archived, no available
destinations
ORA-00312: online log 4 thread 1:
'/archivelog/dmpdb/arch_1_0_' Forcing ARSCN to IRSCN
for TR 0:1234252 Attempt to set limbo arscn 0:1234252 irscn
0:1234252 Resetting standby activation ID 2865247982 (0xaac836ee)
MRP0: Background Media Recovery process shutdown (DMPDB) Terminal
Recovery: completion detected (DMPDB)
Completed: ALTER DATABASE RECOVER MANAGED STADNBY DATABASE FINISH
3、切换数据库到Primary
执行如下SQL检查备库的状态
SQL> select database_role,switchover_status from v$database;
DATABASE_ROLE SWITCHOVER_STATUS ----------------
PHYSICAL STANDBY TO PRIMARY
如果结果为“TO PRIMARY”则说明已经可以切换到主库,继续如下操作。
将备库切换为主库
SQL> alter database commit to switchover to primary; Database altered.
打开新的主库
SQL> alter database open;
Database altered.
alert日志中看到类似如下信息:
alter database commit to switchover to
primary ALTER DATABASE SWITCHOVER TO PRIMARY (DMPDB) Maximum wait
for role transition is 15 minutes. All dispatchers and shared
servers shutdown CLOSE: killing server sessions.
CLOSE: all sessions shutdown successfully.
Wed Mar 04 21:35:47 2015
SMON: disabling cache recovery
Backup controlfile written to trace file
/u01/app/oracle/diag/rdbms/DMPDB/DMPDB/trace/DMPDB_ora_
Standby terminal recovery start SCN: 1234251
RESETLOGS after incomplete recovery UNTIL CHANGE 1234252
Online log /oradata/dmpdb/: Thread 1 Group 1 was
previously cleared
Online log /oradata/dmpdb/: Thread 1 Group 2 was
previously cleared
Online log /oradata/dmpdb/: Thread 1 Group 3 was
previously cleared
Standby became primary SCN: 1234250
Wed Mar 04 21:35:47 2015
Setting recovery target incarnation to 3
AUDIT_TRAIL initialization parameter is changed back to its
original value as specified in the parameter file.
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary
4、检查新主库的TEMPFILE
如果存在问题则进行处理。
5、待原主库环境修复后重做DataGuard
由于做了
failover的强制切换,破坏了原有的DG环境,需要将原主库环境进行重做,
即,将新的主库同步到原主库环境进行DataGuard的重做。
版权声明:本文标题:Oracle11G数据库DataGuard灾备切换方案 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.freenas.com.cn/jishu/1703297963h445951.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论