Groups – OSDBA OSOPER OSASM

This one is quick and short. I have been asked few time, how can we check what value was specified to OSDBA, OSOPER and OSASM,especially during upgrades (to know what was it set to which initial installation)

# The DBA_GROUP is the OS group which is to be granted OSDBA privileges.
# The OPER_GROUP is the OS group which is to be granted OSOPER privileges.
# The OSASM_GROUP is the OS group which is to be granted OSASM privileges.

The below is from GRID_HOME :-

[grid@testdb1 dbs]$ grep “define SS_” $ORACLE_HOME/rdbms/lib/config.c
#define SS_DBA_GRP “asmdba”
#define SS_OPER_GRP “”
#define SS_ASM_GRP “asmadmin”

Even installation logfiles can be checked.

Silent Upgrade Oracle GoldenGate 12.1.2.0.0 to 12.1.2.1.2

Today I worked on silent upgrade of Oracle GoldenGate from 12.1.2.0.0 to 12.1.2.1.2, so thought to blog it which might help some of you. These are the steps I performed

Current GoldenGate Home –> /oracle/app/product/ogg12.1.2

1. Gather the details of GoldenGate Processes before stopping them

GGSCI> INFO EXTRACT EXTL, SHOWCH
INFO EXTRACT DPUMP, SHOWCH
GGSCI> SEND EXTRACT EXTL, SHOWTRANS
GGSCI> STOP EXTRACT DPUMP
GGSCI> STOP EXTRACT EXTL
GGSCI> SEND REPLICAT REPL STATUS
GGSCI> STOP REPLICAT REPL
GGSCI> STOP JAGENT
GGSCI> STOP MANAGER

2. Backup the existing binaries and associated files needed for your environment.

3. Edit the response file and run the runInstaller in silent mode

In the response file

#-------------------------------------------------------------------------------
# Specify the installation option.
# Specify ORA12c for installing Oracle GoldenGate for Oracle Database 12c and
#         ORA11g for installing Oracle GoldenGate for Oracle Database 11g
#-------------------------------------------------------------------------------
INSTALL_OPTION=ORA11g

#-------------------------------------------------------------------------------
# Specify a location to install Oracle GoldenGate
#-------------------------------------------------------------------------------
SOFTWARE_LOCATION=/oracle2/app/product/ogg12.1.2

#-------------------------------------------------------------------------------
# Specify true to start the manager after installation.
#-------------------------------------------------------------------------------
START_MANAGER=false

#-------------------------------------------------------------------------------
# Specify a free port within the valid range for the manager process.
# Required only if START_MANAGER is true.
#-------------------------------------------------------------------------------
MANAGER_PORT=

#-------------------------------------------------------------------------------
# Specify the location of the Oracle Database.
# Required only if START_MANAGER is true.
#-------------------------------------------------------------------------------
DATABASE_LOCATION=

#-------------------------------------------------------------------------------
# Specify the location which holds the install inventory files.
# This is an optional parameter if installing on
# Windows based Operating System.
#-------------------------------------------------------------------------------
INVENTORY_LOCATION=/oracle/oraInventory

#-------------------------------------------------------------------------------
# Unix group to be set for the inventory directory.
# This parameter is not applicable if installing on
# Windows based Operating System.
#-------------------------------------------------------------------------------
UNIX_GROUP_NAME=oinstall

Run the runInstaller in Silent mode

[oracle@oracle gg]$ export ORACLE_HOME=/oracle/app/product/ogg12.1.2
[oracle@oracle gg]$ export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/OPatch:$PATH
[oracle@oracle gg]$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
[oracle@oracle Disk1]$ ./runInstaller -silent -showProgress -waitforcompletion -responseFile /opt/oracle/ogg_upg.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 181836 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 11999 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-04-21_12-23-03AM. Please wait ...You can find the log of this install session at:
 /oracle/oraInventory/logs/installActions2015-04-21_12-23-03AM.log

Prepare in progress.
..................................................   10% Done.

Prepare successful.

Copy files in progress.
..................................................   44% Done.
..................................................   50% Done.
..................................................   60% Done.
..................................................   65% Done.
..................................................   72% Done.
....................
Copy files successful.

Link binaries in progress.
..........
Link binaries successful.
..................................................   82% Done.

Setup files in progress.
..................................................   100% Done.

Setup files successful.
The installation of Oracle GoldenGate Core was successful.
Please check '/oracle/oraInventory/logs/silentInstall2015-04-21_12-23-03AM.log' for more details.
Successfully Setup Software.
[oracle@oracle1 Disk1]$

4. Confirm GoldenGate has been upgraded

[oracle@oracle Disk1]$ cd $ORACLE_HOME
[oracle@oracle ogg12.1.2]$
[oracle@oracle ogg12.1.2]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.1.2.1.0 OGGCORE_12.1.2.1.0_PLATFORMS_140727.2135.1_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Aug  7 2014 09:14:25
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2014, Oracle and/or its affiliates. All rights reserved.

5. Apply Patch 20265694: Oracle GoldenGate V12.1.2.1.2 for Oracle 11G

[oracle@oracle1 20265694]$ opatch apply
Invoking OPatch 11.2.0.1.7

Oracle Interim Patch Installer version 11.2.0.1.7
Copyright (c) 2011, Oracle Corporation.  All rights reserved.


Oracle Home       : /oracle/app/product/ogg12.1.2
Central Inventory : /oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.7
OUI version       : 11.2.0.3.0
Log file location : /oracle/app/product/ogg12.1.2/cfgtoollogs/opatch/opatch2015-04-21_00-24-15AM.log

Applying interim patch '20265694' to OH '/oracle/app/product/ogg12.1.2'
Verifying environment and performing prerequisite checks...

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/oracle/app/product/ogg12.1.2')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...

Patching component oracle.oggcore.ora11g, 12.1.2.1.0...
Patch 20265694 successfully applied
Log file location: /oracle/app/product/ogg12.1.2/cfgtoollogs/opatch/opatch2015-04-21_00-24-15AM.log

OPatch succeeded.
[oracle@oracle1 20265694]$
[oracle@oracle1 20265694]$ opatch lsinv
Invoking OPatch 11.2.0.1.7

Oracle Interim Patch Installer version 11.2.0.1.7
Copyright (c) 2011, Oracle Corporation.  All rights reserved.
........................
........................
Patch  20265694     : applied on Tue Apr 21 00:24:27 EDT 2015
Unique Patch ID:  18593256
   Created on 3 Feb 2015, 15:03:21 hrs PST8PDT
   Bugs fixed:
     19818362, 19602692, 19241234, 17423191, 19781984, 19535319, 19724915
     19721652, 19516537, 19441114, 19132627, 19889991, 19681035

6. Confirm the GoldenGate version is now 12.1.2.1.2

[oracle@oracle ogg12.1.2]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.1.2.1.2 20133048 OGGCORE_12.1.2.1.0OGGBP_PLATFORMS_141228.0533_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Dec 28 2014 13:19:44
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.

7. Start the GoldenGate Processes


GGSCI (oracle.test.com) 1> dblogin userid ggate, Password "xxxxxxxx"
Successfully logged into database.

GGSCI> START MANAGER
GGSCI> START JAGENT
GGSCI> START EXTRACT EXTL
GGSCI> START EXTRACT DPUMP
GGSCI> START REPLICAT REPL

Primary on FileSystem and Standby on ASM

For one of the client, standby server went down. We had another standby server which was kept down for more than a month. Decision was taken to start the server and apply incremental SCN based backup on the standby database.

The standby was on ASM and the Primary on filesystem.Incremental backup was started from the SCN reported by below query

select min(fhscn) from x$kcvfh;

Once the backup completed, it was transferred to standby, standby was mounted (using the old controlfile), backups were cataloged and recovery performed using ‘recover database noredo’.

The recovery was going on, and was handed over to me. After the recovery completed I restored the latest controlfile from Primary and mounted the standby. At this point the controlfile was with the information of the filesystem as they were on the Primary side. The next step was to register everything we had on Standby side:

[oracle@oracle3:~ (db)]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Thu Jan 8 04:57:01 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ABCD (DBID=1463580380, not open)

RMAN> catalog  start with '+DATA/adbc_oracle3/DATAFILE/';

searching for all files that match the pattern +DATA/adbc_oracle3/DATAFILE/

List of Files Unknown to the Database
=====================================
File Name: +data/adbc_oracle3/DATAFILE/TEST.256.844395985
File Name: +data/adbc_oracle3/DATAFILE/TEST.257.844397067
.................
.................
File Name: +data/adbc_oracle3/DATAFILE/TEST.416.865953683

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: +data/adbc_oracle3/DATAFILE/TEST.256.844395985
File Name: +data/adbc_oracle3/DATAFILE/TEST.257.844397067
................
...............
File Name: +data/adbc_oracle3/DATAFILE/TEST.416.865953683

RMAN> report schema;

RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name ABCD_ORACLE3

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    0        SYSTEM               ***     /oradata/abcd_oracle2/datafile/o1_mf_system_825xkscr_.dbf
2    0        SYSAUX               ***     /oradata/abcd_oracle2/datafile/o1_mf_sysaux_825y451r_.dbf
3    0        TEST                 ***     /oradata/abcd_oracle2/datafile/o1_mf_test_825s84mw_.dbf
4    0        TEST                 ***     /oradata2/abcd_oracle2/datafile/o1_mf_test_8dr1v332_.dbf
................
................
................
147  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hcrh_.dbf
148  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hdhf_.dbf
149  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hf6o_.dbf
150  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hg1j_.dbf
151  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_bb318bhs_.dbf
152  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_bb318cff_.dbf
153  0        TEST_INDEX             ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_index_bb318pmy_.dbf
154  0        TEST_NOLOGGING         ***     /oradata3/abcd_oracle2/datafile/o1_mf_test_nolog_bbm2s7vk_.dbf
155  0        TESTINDEX             ***     /oradata3/abcd_oracle2/datafile/o1_mf_test_index_bbm2z7nv_.dbf
156  0        PERFSTAT             ***     /oradata3/abcd_oracle2/datafile/o1_mf_perfstat_bbm312pf_.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
3    15104    TEMP                 32767       /oradata4/abcd_oracle2/datafile/o1_mf_temp_b633ppbr_.tmp
4    25600    TEMP                 32767       /oradata4/abcd_oracle2/datafile/o1_mf_temp_b633ppcf_.tmp

After catalog completed, it was time to switch database to copy and it failed with below error

RMAN> switch database to copy;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of switch to copy command at 01/08/2015 05:00:25
RMAN-06571: datafile 151 does not have recoverable copy

After some analysis found, it was due to missing datafile on standby. The datafile was created on Primary after the standby was down and the recovery using the incremental backup was done with the older controlfile which had no information about the new datafiles.

Datafile # 151 – 156 were missing, so cataloged the backup pieces again as the controlfile was restored, and started restoring the datafile

RMAN> restore datafile 151;

Starting restore at 08-JAN-15
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00151 to /oradata4/ABCD_ORACLE2/datafile/o1_mf_ct_bb318bhs_.dbf
channel ORA_DISK_1: reading from backup piece +BACKUP/ABCD_ORACLE3/backupset/restore/incr_standby_5sps58rd_1_1

channel ORA_DISK_1: piece handle=+BACKUP/ABCD_ORACLE3/backupset/restore/incr_standby_5sps58rd_1_1 tag=TAG20150107T192825
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:19:05
Finished restore at 08-JAN-15

RMAN>

After the restoration completed, report schema showed

RMAN> report schema;

RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name adbc_oracle3

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    0        SYSTEM               ***     /oradata/ABCD_ORACLE2/datafile/o1_mf_system_825xkscr_.dbf
2    0        SYSAUX               ***     /oradata/ABCD_ORACLE2/datafile/o1_mf_sysaux_825y451r_.dbf
.................
151  30720    TEST                  ***     +DATA/adbc_oracle3/datafile/test.417.868424659 
152  30720    TEST                  ***     +DATA/adbc_oracle3/datafile/test.418.868424659 
................

Tried to perform “switch database to copy” which again failed with the same error “RMAN-06571: datafile 151 does not have recoverable copy” . At this point I though to use “switch datafile to copy” for which generated dynamic sql from primary and ran in to standby.

Generated switch command sql from Primary :-

SQL> select 'switch datafile '||file#||' to copy;' from v$datafile;

[oracle@oracle3:~ (db)]$ vi swtch_copy.rman
[oracle@oracle3:~ (db)]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Thu Jan 8 06:09:46 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ABCD (DBID=1463580380, not open)

RMAN> @swtch_copy.rman

RMAN> switch datafile 2 to copy;
using target database control file instead of recovery catalog
datafile 2 switched to datafile copy "+DATA/adbc_oracle3/datafile/sysaux.277.844418721"

............................
...........................
...........................
RMAN> switch datafile 149 to copy;
datafile 149 switched to datafile copy "+DATA/adbc_oracle3/datafile/ct.415.865953681"

RMAN> switch datafile 156 to copy;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of switch to copy command at 01/08/2015 06:28:54
RMAN-06571: datafile 156 does not have recoverable copy

RMAN>
RMAN> **end-of-file**

Performed ‘recover database noredo’ again and this time it was pretty quick and then tried recover standby database

[oracle@oracle3:~/working/anand (abcd)]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jan 8 06:31:38 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> recover standby database;
ORA-00279: change 38328244436 generated at 01/07/2015 19:28:32 needed for thread 1
ORA-00289: suggestion : +FRA
ORA-00280: change 38328244436 for thread 1 is in sequence #98501


Specify log: {=suggested | filename | AUTO | CANCEL}
^C
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '+DATA/adbc_oracle3/datafile/system.357.844454911'


Did few changes in DGMGRL parameters and enabled standby configuration. After few hours, standby was insync with Primary. Stopped the MRP , dropped the standby
redo logfiles as they showed filesystem, and created on ASM. Opened the standby in read only mode and started the MRP.

ORA-00600: [ktbdchk1: bad dscn]

Last week I had performed switchover activity of database on version 11.2.0.3 The switchover was performed using dgmgrl “swicthover to standby” command. After sometime we started receiving “ORA-00600: [ktbdchk1: bad dscn]” on the primary database.

Tue Dec 16 10:33:26 2014
Errors in file /ora_software/diag/rdbms/db02_dbv/dbv/trace/db02_ora_16271.trc  (incident=434103):
ORA-00600: internal error code, arguments: [ktbdchk1: bad dscn], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /ora_software/diag/rdbms/db02_dbv/dbv/incident/incdir_434103/db02_ora_16271_i434103.trc

The trace file showed

*** ACTION NAME:() 2014-12-16 10:33:26.857

Dump continued from file: /ora_software/diag/rdbms/db02_dbv/dbv/trace/db02_ora_16271.trc
ORA-00600: internal error code, arguments: [ktbdchk1: bad dscn], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 434103 (ORA 600 [ktbdchk1: bad dscn]) ========
----- Beginning of Customized Incident Dump(s) -----
[ktbdchk] -- ktbgcl4 -- bad dscn
dependent scn: 0x0008.f197f24e recent scn: 0x0008.c7313a4c current scn: 0x0008.c7313a4c
----- End of Customized Incident Dump(s) -----
*** 2014-12-16 10:33:26.961
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=dmrpvzvbbsuy5) -----
INSERT INTO MSG_OPEN( SITE_ID, CLIENT_ID, CAMP_ID, MESSAGE_ID, ID, USER_ID, ADDRESS, DATE_OPENED) VALUES( :B7 , :B6 , :B5 , :B4 , LOG_SEQ.NEXTVAL, :B3 , :B2 , :B1 )
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x32276ee18       611  package body TP.EM_PKG
0x31cbac388         3  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFFFDA13028 ? 000000001 ?
                                                   000000001 ? 000000002 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFFFDA13028 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedst()+34          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFFDA13028 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkedDefDump()+2741  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFFDA13028 ? 000000001 ?
                                                   000000000 ? 000000002 ?
........................

Searching for the issue on Metalink pointed to the below Document:-

ALERT Description and fix for Bug 8895202: ORA-1555 / ORA-600 [ktbdchk1: bad dscn] ORA-600 [2663] in Physical Standby after switchover (Doc ID 1608167.1)

As per metalink

In a Data Guard environment with Physical Standby (including Active Data Guard), invalid SCNs can be introduced in index blocks after a switchover.

Symptoms ORA-1555 / ORA-600 [ktbdchk1: bad dscn] / ktbGetDependentScn / Dependent scn violations as the block ITL has higher COMMIT SCN than block SCN. DBVERIFY reports the next error when the fix of Bug 7517208 is present; reference Note 7517208.8 for interim patches: itl[] has higher commit scn(aaa.bbb) than block scn (xx.yy) Page failed with check code 6056 There is NO DATA CORRUPTION in the block.

To Resolve

The fix of Bug 8895202 is the workaround.

Although the fix of Bug 8895202 is included in patchset 11.2.0.2 and later, the fix needs to be enabled by setting parameter _ktb_debug_flags = 8.

SQL> alter system set "_ktb_debug_flags"=8 scope=both sid='*';

System altered.

SQL> exit

If you are using Oracle version less than 11.2.0.2, then rebuilding index is the option, as we did for one of the client on 11.1.0.7 version.

One thing to note is —

In rare cases blocks healed by this fix may cause queries to fail with an ORA-600 [ktbgcl1_KTUCLOMINSCN_1] as described in Note 13513004.8 / Bug 13513004.

For more detail Metalink Doc 1608167.1

Upgrading to 11.2.0.4 – Dictionary View Performing Poor

Just a quick blog post on things you might see after upgrading to 11.2.0.4. We recently upgraded database from 11.2.0.3 to 11.2.0.4 and query on some data dictionary views ran too slow.

1. Performace of query on dba_free_space degraded
2. Performance of query involving dba_segments is slow

DEV01> select ceil(sum(b.bytes)/1024/1024) b from sys.dba_free_space b;

Elapsed: 01:31:45.78

Searching MOS pointed to these Doc Ids :-

Insert Statement Based Upon a Query Against DBA_SEGMENTS is Slow After Applying 11.2.0.4 Patchset (Doc ID 1915426.1)

Query Against DBA_FREE_SPACE is Slow After Applying 11.2.0.4 (Doc ID 1904677.1)

DEV01> select ceil(sum(b.bytes)/1024/1024) b from sys.dba_free_space b;

Elapsed: 00:00:01.38

ORA-00600 – [kfnsInstanceReg00]

An interesting case of ORA-600:)

I was paged for standby lag and started looking into the issue. The db version was 11.1.0.7.0 version. Standby was lagging behind by 36 archives

SQL> @mrp

   INST_ID PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK# DELAY_MINS
---------- --------- ------------ ---------- ---------- ---------- ----------
         1 MRP0      WAIT_FOR_LOG          1     568330          0          0 


   INST_ID PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK# DELAY_MINS
---------- --------- ------------ ---------- ---------- ---------- ----------
         1 RFS       IDLE                  0          0          0          0
         1 RFS       IDLE                  0          0          0          0
         1 RFS       WRITING               1     568330          1          0 
         1 RFS       IDLE                  0          0          0          0
         1 RFS       IDLE                  1     568366     309529          0

After sometime checked back again and still on the same sequence#. Status of MRP and RFS

SQL> @mrp

   INST_ID PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK# DELAY_MINS
---------- --------- ------------ ---------- ---------- ---------- ----------
         1 MRP0      WAIT_FOR_LOG          1     568330          0          0


   INST_ID PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK# DELAY_MINS
---------- --------- ------------ ---------- ---------- ---------- ----------
         1 RFS       IDLE                  0          0          0          0
         1 RFS       IDLE                  0          0          0          0
         1 RFS       WRITING               1     568330          1          0 
         1 RFS       IDLE                  0          0          0          0
         1 RFS       IDLE                  1     568366     795006          0

No error was reported in alert log or was in v$archive_dest_status.Archive log genertaion on Primary was usual like any other day, nothing changed.Seq# 568330 was present on Primary and size was 497Mb whereas the same seq# size on standby was 513Mb. I am not sure why!!!

So, i thought to copy the archive from the primary, stop mrp , perform manual recovery using ‘recover standby standby’ as other 35 archives were present on disk and then start real time apply.

Once the copy completed, i stopped the MRP and started manual recovery

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> select name,open_mode,database_role from v$database;

NAME      OPEN_MODE  DATABASE_ROLE
--------- ---------- ----------------
TEST      MOUNTED    PHYSICAL STANDBY

SQL> recover standby database;
ORA-00279: change 69956910076 generated at 10/26/2014 21:46:36 needed for
thread 1
ORA-00289: suggestion :
/oraclebackup/fra/TEST/archivelog/2014_10_26/o1_mf_1_568330_b4vdl546_.arc
ORA-00280: change 69956910076 for thread 1 is in sequence #568330


Specify log: {=suggested | filename | AUTO | CANCEL}
AUTO
ORA-16145: archival for thread# 1 sequence# 568330 in progress 
SQL> @mrp

no rows selected


   INST_ID PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK# DELAY_MINS
---------- --------- ------------ ---------- ---------- ---------- ----------
         1 RFS       IDLE                  0          0          0          0
         1 RFS       IDLE                  0          0          0          0
         1 RFS       WRITING               1     568330          1          0 
         1 RFS       IDLE                  0          0          0          0
         1 RFS       IDLE                  1     568372     563688          0

At this point i thought to bounce the standby and then perform recovery. I issued shutdown immediate and it was hung. Almost after 10mins alert log showed

Active call for process 45660 user 'oracle' program 'oracle@test.com'
SHUTDOWN: waiting for active calls to complete.

I checked the PID 45660 and as it showed LOCAL=NO, did kill -9

[oracle@test:~/app/oracle (test)]$ ps -ef | grep 45660
oracle   41463 16050  0 00:43 pts/0    00:00:00 grep 45660
oracle   45660     1  1 Oct23 ?        01:09:22 oracletest (LOCAL=NO)
[oracle@test:~/app/oracle (test)]$ kill -9 45660 
[oracle@test:~/app/oracle (test)] ps -ef | grep 45660
oracle   42181 16050  0 00:44 pts/0    00:00:00 grep 45660
oracle   45660     1  1 Oct23 ?        01:09:22 oracletest (LOCAL=NO)

Kill -9 didn’t kill the process. As it was standby, i thought to go ahead and Abort the instance and this is what happend when i tried startup mount

SQL> startup mount
ORACLE instance started.

Total System Global Area 6.1466E+10 bytes
Fixed Size                  2174600 bytes
Variable Size            2.0535E+10 bytes
Database Buffers         4.0802E+10 bytes
Redo Buffers              126640128 bytes

ORA-03113: end-of-file on communication channel
Process ID: 47331
Session ID: 995 Serial number: 3

Alert log showed

Mon Oct 27 00:48:09 2014
ALTER DATABASE MOUNT
Mon Oct 27 00:53:10 2014
System State dumped to trace file /home/oracle/app/oracle/diag/rdbms/test_TEST_db/test/trace/test_ckpt_47252.trc
Mon Oct 27 00:53:11 2014
Errors in file /home/oracle/app/oracle/diag/rdbms/test_TEST_db/test/trace/test_asmb_47260.trc (incident=491160):
ORA-00600: internal error code, arguments: [600], [ORA_NPI_ERROR], [ORA-00600: internal error code, arguments: [kfnsInstanceReg00], [test:test_TEST_DB], [30000], [], [], [], [], [], [], [], [], []
], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/app/oracle/diag/rdbms/test_TEST_db/test/incident/incdir_491160/test_asmb_47260_i491160.trc
Errors in file /home/oracle/app/oracle/diag/rdbms/test_TEST_db/test/trace/test_asmb_47260.trc:
ORA-15064: communication failure with ASM instance

ORA-00600: internal error code, arguments: [600], [ORA_NPI_ERROR], [ORA-00600: internal error code, arguments: [kfnsInstanceReg00], [test:test_TEST_DB], [30000], [], [], [], [], [], [], [], [], []
], [], [], [], [], [], [], [], [], []
Mon Oct 27 00:53:12 2014
Mon Oct 27 00:53:12 2014
Sweep Incident[491160]: completed
Mon Oct 27 00:53:12 2014
Trace dumping is performing id=[cdmp_20141027005312]
Errors in file /home/oracle/app/oracle/diag/rdbms/test_TEST_db/prod/trace/test_ckpt_47252.trc:
ORA-15083: failed to communicate with ASMB background process
CKPT (ospid: 47252): terminating the instance due to error 15083
Mon Oct 27 00:53:20 2014
Instance terminated by CKPT, pid = 47252

The CKPT trace showed


15083: Timeout waiting for ASMB to be ready (state=1)
ASMB is ALIVE
—– Abridged Call Stack Trace —–

SYSTEM STATE (level=10, with short stacks)
————
System global information:
processes: base 0xe90ea0088, size 900, cleanup 0xea0ec6598
allocation: free sessions 0xeb8f53d68, free calls (nil)
control alloc errors: 0 (process), 0 (session), 0 (call)
PMON latch cleanup depth: 0
seconds since PMON's last scan for dead processes: 1
system statistics:
22 logons cumulative
19 logons current
5 opened cursors cumulative
1 opened cursors current
0 user commits
0 user rollbacks
51 user calls
19 recursive calls
0 recursive cpu usage
0 session logical reads
0 session stored procedure space
0 CPU used when call started
0 CPU used by this session
1 DB time

Looking at the ASM instance alert log

System State dumped to trace file /home/oracle/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_33024.trc
Errors in file /home/oracle/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_33024.trc:

more /home/oracle/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_33024.trc


*** ACTION NAME:() 2014-10-27 02:18:39.197

kfnmsg_wait: KSR failure reaping message (0xffffff83)

*** 2014-10-27 02:18:39.197
Communication failure sending message to UFGs (0xffffff83)
kfnmsg_wait: opcode=35 msg=0x6b3f0928
2014-10-27 02:18:39.197762 :kfnmsg_Dump(): kfnmsg_SendtoUFGs: (kfnmsg) op=35 (KFNMS_GROUP_RLS) (gnum) [1.0]
CommunicationFailure 0xffffff83 after 300s

The server was hosting 2 standby db instances and 1 ASM instance. We thought to bring down the other standby db instance and then the ASM instance and then finally restart all. The shutdown immediate for 2nd standby instance was hung for long time so we aborted it. After ASM was brought down, still we could see below 5 pids

$ ps -ef | grep oracle
oracle 12032 16050 0 02:44 pts/0 00:00:00 grep arc
oracle 23135 1 0 Oct18 ? 00:02:15 ora_arc5_test
oracle 25390 1 0 Oct18 ? 00:01:19 ora_arc1_test2
oracle 25392 1 0 Oct18 ? 00:00:50 ora_arc2_test2

oracle 45660 1 1 Oct23 ? 01:09:22 oracleprod (LOCAL=NO)

Even after all the instances were down on server we could see few archiver Pids and a Pid with LOCAL=NO. Kill -9 also didn’t kill the pid.

Started the ASM instance and then when i started the standby instance TEST , ASM crashed

NOTE: check client alert log.
NOTE: Trace records dumped in trace file /home/oracle/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_57380.trc
Errors in file /home/oracle/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_57380.trc (incident=280159):
ORA-00600: internal error code, arguments: [kfupsRelease01], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/app/oracle/diag/asm/+asm/+ASM/incident/incdir_280159/+ASM_ora_57380_i280159.trc

At this point escalated to client for server reboot. After the server reboot was done, all the instances came up properly. Started MRP and standby came insync.

Also you can reference to the below Metalink documents

Database startup Reports ORA-00600:[kfnsInstanceReg00] (Doc ID 1552422.1)
Bug 6934636 – Hang possible for foreground processes in ASM instance (Doc ID 6934636.8)
ORA-600 [kfnsInstanceReg00] and long waits on “ASM FILE METADATA OPERATION” (Doc ID 1480320.1)

Sangam 2014

I am back from AIOUG meet “SANGAM 14 – Meeting of Minds” and it was a wonderful experience. Had really nice time meeting some old friends and making few new ones:). I think i should mention here, finally i met Amit Bansal , mind behind http://askdba.org/ . It was fun meeting him:)

It was 3 day conference with

1st day :- Optimizer Master Class by Tom Kyte. It was full day seminar. Learned some new optimizer features of 12c. We all were very thankful to Tom for being in India for the 3rd time to present in Sangam.

2nd day :- The day started with “What You Need To Know About Oracle Database In-Memory Option” by Maria Colgan. If i need to describe in a word i would say “Awesome!!!!” . I loved every minute of the presentation and was well presented and very informative. Rest of the day i moved from room to room, attending some good sessions on 12c and optimization.

3rd day: – I wasn’t much serious on the 3rd day😀 . Mostly spent time meeting people around and discussing about oracle. What i loved on 3rd day was an hour session on “Time to Reinvent Yourself – Through Learning, Leading, and Failing” by Dr. Rajdeep Manwani. He shared his life experiences and some truth about human nature. It was amazing.

Overall it was nice to be part of Sangam 14 and learn some really new things mostly on 12c. Thanks to all the speakers and organizing committee for the efforts and valuable time they have put in.

Thanks all!!!!!