EN 联系我们加入我们
典型案例
您现在的位置:首页 > 典型案例
【案例分享】Oracle数据库紧急故障分析处理报告



一、环境说明


OS:REDHAT 6.9

数据库:ORACLE 11.2.0.4



二、故障描述



客户机房非正常断电,待恢复供电后,数据库无法正常启动,报错如下:


ALTER DATABASE   MOUNT

Successful mount of redo thread 1, with mount id 4242209003

Database mounted in Exclusive Mode

Lost write protection disabled

Completed: ALTER DATABASE   MOUNT

ALTER DATABASE OPEN

Errors in file D:\ORACLE\diag\rdbms\orcl\orcl\trace\orcl_ora_4568.trc:

ORA-01122: 数据库文件 1 验证失败

ORA-01110: 数据文件 1: 'D:\ORACLE\ORADATA\orcl\SYSTEM01.DBF'

ORA-01207: 文件比控制文件更新 - 旧的控制文件

ORA-1122 signalled during: ALTER DATABASE OPEN...

Errors in file D:\ORACLE\diag\rdbms\orcl\orcl\trace\orcl_m000_4732.trc:

ORA-00338: 日志 1 (用于线程 1) 比控制文件更新

ORA-00312: 联机日志 1 线程 1: 'D:\ORACLE\ORADATA\orcl\REDO01.LOG'

Errors in file D:\ORACLE\diag\rdbms\orcl\orcl\trace\orcl_m000_4732.trc:

ORA-00338: 日志 1 (用于线程 1) 比控制文件更新

ORA-00312: 联机日志 1 线程 1: 'D:\ORACLE\ORADATA\orcl\REDO01.LOG'

Errors in file D:\ORACLE\diag\rdbms\orcl\orcl\trace\orcl_m000_4732.trc:

ORA-00338: 日志 3 (用于线程 1) 比控制文件更新

ORA-00312: 联机日志 3 线程 1: 'D:\ORACLE\ORADATA\orcl\REDO03.LOG'

Errors in file D:\ORACLE\diag\rdbms\orcl\orcl\trace\orcl_m000_4732.trc:

ORA-00338: 日志 3 (用于线程 1) 比控制文件更新

ORA-00312: 联机日志 3 线程 1: 'D:\ORACLE\ORADATA\orcl\REDO03.LOG'

Checker run found 1 new persistent data failures


三、故障分析


查看报错信息发现,当前redo日志损坏;工程师尝试人工指定redo应用,报ORA-01113错误,此时可确认redo已损坏:

sqlplus / as sysdba

SQL> Recover   database using backup controlfile until cancel

 

SQL> alter database open;

alter database open

*

ERROR at line 1:

ORA-01113: file 1 needs media recovery

ORA-01110: data file 1: 'D:\ORCL\SYSTEM01.DBF'

 

SQL> recover database ;

ORA-00283: recovery session canceled due to errors

ORA-16433: The database must be opened in read/write mode.

 

四、故障处理


1、对pfile文件增加如下参数后继续恢复(增加参数可跳过redo错误),重建redo:

*._ALLOW_RESETLOGS_CORRUPTION=true

*.undo_management=manual

 

2、重建控制文件尝试恢复:

CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS  NOARCHIVELOG

MAXLOGFILES 16

MAXLOGMEMBERS 3

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 292

LOGFILE

GROUP 1 'D:\ORACLE\ORADATA\orcl\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,

GROUP 2 'D:\ORACLE\ORADATA\orcl\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,

GROUP 3 'D:\ORACLE\ORADATA\orcl\REDO03.LOG'  SIZE 50M BLOCKSIZE 512

DATAFILE

'D:\ORACLE\ORADATA\orcl\SYSTEM01.DBF',

'D:\ORACLE\ORADATA\orcl\SYSAUX01.DBF',

'D:\ORACLE\ORADATA\orcl\UNDOTBS01.DBF',

'D:\ORACLE\ORADATA\orcl\USERS01.DBF',

'D:\ORACLE\ORADATA\orcl\TYITSM01.DBF',

'D:\ORACLE\ORADATA\orcl\TYITSM02.DBF'

CHARACTER SET ZHS16GBK;

 

继续打开数据库:

SQL> alter database open resetlogs;

 

此时发现无法正常open数据库,报错如下:


QMNC started with pid=36, OS id=12985

LOGSTDBY: Validating controlfile with logical   metadata

LOGSTDBY: Validation complete

Completed: alter database open

Starting background process CJQ0

CJQ0 started with pid=41, OS id=12999

Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m000_12989.trc  (incident=365101):

ORA-00600: internal error code, arguments:   [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [],   []

Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m002_12993.trc  (incident=365117):

ORA-00600: internal error code, arguments:   [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [],   []

Incident details in:   /home/app/diag/rdbms/orcl/orcl/incident/incdir_365101/orcl_m000_12989_i365101.trc

Incident details in:   /home/app/diag/rdbms/orcl/orcl/incident/incdir_365117/orcl_m002_12993_i365117.trc

Use ADRCI or Support Workbench to package the   incident.

See Note 411.1 at My Oracle Support for error   and packaging details.

Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m000_12989.trc  (incident=365102):

ORA-00600: internal error code, arguments:   [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [],   []

Incident details in:   /home/app/diag/rdbms/orcl/orcl/incident/incdir_365102/orcl_m000_12989_i365102.trc

Use ADRCI or Support Workbench to package the   incident.

See Note 411.1 at My Oracle Support for error   and packaging details.

Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m002_12993.trc  (incident=365118):

ORA-00600: internal error code, arguments:   [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [],   []

Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_365118/orcl_m002_12993_i365118.trc



错误说明

2662错误的说明:
ORA-600 [2662] [a] [b] [c] [d] [e]
Arg [a] Current SCN WRAP
Arg [b] Current SCN BASE
Arg [c] dependent SCN WRAP
Arg [d] dependent SCN BASE
Arg [e] Where present this is the DBA where the dependent SCN came from


2662错误一般通过增加SCN值为期望值来处理。

增加SCN的方法很多,比如常用的:oradebug,event,parameter等方法。
下文选择oradebug的方式来增加SCN,此方法通用,win/linux/unix平台均可使用。

SQL> oradebug setmypid
Statement processed.
SQL> oradebug dumpvar sga kcsgscn_
kcslf kcsgscn_ [0XXXXXXX0, 06001AEA0) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000
SQL> oradebug poke 0xXXXXXXXX0 4 772867878;
BEFORE: [0XXXXXXX0, 0XXXXXXX4) = 00000000
AFTER: [0XXXXXXX0, 0XXXXXXX4) = 2E110726

SQL> alter database open resetlogs;
Database altered.

SQL> select open_mode from v$database;
OPEN_MODE
—————————————-
READ WRITE

 

如果open之后出现ora-600 4194/4193错误,可重建undo解决,最后进行数据库导出。



五、经验总结

在本次故障处理中,建议数据库管理员在生产库中一定要开启归档,并及时做好备份,保证数据恢复的及时有效。



如欲了解更多,请登录安图特官方网站:www.antute.com.cn

版权所有 安图特(北京)科技有限公司 备案号:京ICP备17074963号-1
技术支持:创世网络