Exchange Partition – Exchanging Table Partitions

Recently I came across a requirement of moving table to new tablespace. The table had 12 range based partitions and each partition had ~8K list subpartitions and was storing historic data. After few discussion, it was decided to use exchange partition instead of dbms_redefinition.

As I did not have much practical experience in exchange partition, I thought to test it and I am just posting my test here in the blog

CREATE TABLE composite_rng_list (
cust_id NUMBER(10),
cust_name VARCHAR2(25),
cust_state VARCHAR2(2),
time_id DATE)
PARTITION BY RANGE(time_id)
SUBPARTITION BY LIST (cust_state)
SUBPARTITION TEMPLATE(
SUBPARTITION west VALUES ('OR', 'WA') TABLESPACE demo,
SUBPARTITION east VALUES ('NY', 'CT') TABLESPACE demo,
SUBPARTITION cent VALUES ('OK', 'TX') TABLESPACE demo) (
PARTITION per1 VALUES LESS THAN (TO_DATE('01/01/2014','DD/MM/YYYY')),
PARTITION per2 VALUES LESS THAN (TO_DATE('01/01/2015','DD/MM/YYYY')),
PARTITION per3 VALUES LESS THAN (TO_DATE('01/01/2016','DD/MM/YYYY')),
PARTITION future VALUES LESS THAN(MAXVALUE));

Table created.

SQL> insert into composite_rng_list values (1,'A','NY',to_date('01/02/2015','DD/MM/YYYY'));

1 row created.

Elapsed: 00:00:00.06
SQL> insert into composite_rng_list values (2,'B','CT',to_date('09/09/2014','DD/MM/YYYY'));

1 row created.

Elapsed: 00:00:00.06
SQL> insert into composite_rng_list values (3,'C','TX',to_date('10/10/2015','DD/MM/YYYY'));

1 row created.

Elapsed: 00:00:00.01
SQL> insert into composite_rng_list values (4,'D','WA',to_date('11/11/2013','DD/MM/YYYY'));

1 row created.

Elapsed: 00:00:00.01
SQL> insert into composite_rng_list values (5,'E','OK',to_date('21/02/2014','DD/MM/YYYY'));

1 row created.

Elapsed: 00:00:00.01
SQL> insert into composite_rng_list values (6,'F','OR',to_date('07/07/2013','DD/MM/YYYY'));

1 row created.

Elapsed: 00:00:00.00
SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats('ANAND','COMPOSITE_RNG_LIST');

PL/SQL procedure successfully completed.


SQL>  select table_owner,table_name,partition_name,subpartition_count,num_rows,tablespace_name from dba_tab_partitions where table_owner='ANAND' and table_name='COMPOSITE_RNG_LIST'
;

TABLE_OWN TABLE_NAME             PARTITION_NAME  SUBPARTITION_COUNT   NUM_ROWS TABLESPACE_NAME
--------- ---------------------- --------------- ------------------ ---------- ------------------------------
ANAND     COMPOSITE_RNG_LIST     FUTURE                           3          0 DEMO
ANAND     COMPOSITE_RNG_LIST     PER1                             3          2 DEMO
ANAND     COMPOSITE_RNG_LIST     PER2                             3          2 DEMO
ANAND     COMPOSITE_RNG_LIST     PER3                             3          2 DEMO

SQL> select table_owner,table_name,partition_name,subpartition_count from dba_tab_partitions where table_owner='ANAND' and table_name='COMPOSITE_RNG_LIST';

TABLE_OWN TABLE_NAME             PARTITION_NAME  SUBPARTITION_COUNT
--------- ---------------------- --------------- ------------------
ANAND     COMPOSITE_RNG_LIST     FUTURE                           3
ANAND     COMPOSITE_RNG_LIST     PER1                             3
ANAND     COMPOSITE_RNG_LIST     PER2                             3
ANAND     COMPOSITE_RNG_LIST     PER3                             3

SQL> select table_owner,table_name,partition_name,subpartition_name,high_value,num_rows from dba_tab_subpartitions  where table_owner='ANAND' and table_name='COMPOSITE_RNG_LIST';

TABLE_OWN TABLE_NAME             PARTITION_NAME  SUBPARTITION_NA HIGH_VALUE      NUM_ROWS
--------- ---------------------- --------------- --------------- --------------- ----------
ANAND     COMPOSITE_RNG_LIST     FUTURE          FUTURE_CENT     'OK', 'TX'          0
ANAND     COMPOSITE_RNG_LIST     FUTURE          FUTURE_EAST     'NY', 'CT'          0
ANAND     COMPOSITE_RNG_LIST     FUTURE          FUTURE_WEST     'OR', 'WA'          0
ANAND     COMPOSITE_RNG_LIST     PER3            PER3_CENT       'OK', 'TX'          1
ANAND     COMPOSITE_RNG_LIST     PER3            PER3_EAST       'NY', 'CT'          1
ANAND     COMPOSITE_RNG_LIST     PER3            PER3_WEST       'OR', 'WA'          0
ANAND     COMPOSITE_RNG_LIST     PER2            PER2_CENT       'OK', 'TX'          1
ANAND     COMPOSITE_RNG_LIST     PER2            PER2_EAST       'NY', 'CT'          1
ANAND     COMPOSITE_RNG_LIST     PER2            PER2_WEST       'OR', 'WA'          0
ANAND     COMPOSITE_RNG_LIST     PER1            PER1_CENT       'OK', 'TX'          0
ANAND     COMPOSITE_RNG_LIST     PER1            PER1_EAST       'NY', 'CT'          0
ANAND     COMPOSITE_RNG_LIST     PER1            PER1_WEST       'OR', 'WA'          2

Created a local and a global index on the table

SQL> create index i_crl_state on composite_rng_list(cust_state);

Index created.

SQL> create index i_crl_time on composite_rng_list(time_id) local;

Index created.

Index details

SQL> @index
Enter value for owner: ANAND
Enter value for table_name: composite_rng_list

TABLE_OWNER     TABLE_NAME            INDEX_OWNER     INDEX_NAME            TABLESPACE_NAME   NUM_ROWS      CLUST STATUS   INDEX_TYPE
--------------- --------------------- --------------- --------------------- --------------- ---------- ---------- -------- ------------
ANAND           COMPOSITE_RNG_LIST    ANAND           I_CRL_STATE           DEMO                     6          6 VALID    NORMAL
ANAND           COMPOSITE_RNG_LIST    ANAND           I_CRL_TIME                                     6          5 N/A      NORMAL

SQL> @partition_index
Enter value for index_name: I_CRL_TIME
Enter value for owner: ANAND

INDEX_NAME            PARTITION_NA STATUS   TABLESPACE_NAME INITIAL_EXTENT NEXT_EXTENT PCT_INCREASE   NUM_ROWS
--------------------- ------------ -------- --------------- -------------- ----------- ------------ ----------
I_CRL_TIME            PER1         N/A                                                                    2
I_CRL_TIME            PER2         N/A                                                                    2
I_CRL_TIME            PER3         N/A                                                                    2
I_CRL_TIME            FUTURE       N/A                                                                    0

Now lets create a table with which we would be exchanging the partition. Below I have created EXP_TEMP_PER1 table, which will exchange partition PER1 from COMPOSITE_RNG_LIST table. Note table is created as list partition (based on subpartition of original table). In case you create the table same as source table (range-list partition) and try exchange partition you would be hit with

ORA-14292: Partitioning type of table must match subpartitioning type of composite partition


SQL> create table ANAND.EXP_TEMP_PER1 (
   CUST_ID NUMBER(10)
   , CUST_NAME VARCHAR2(25)
   , CUST_STATE VARCHAR2(2)
   , TIME_ID DATE
 ) tablespace TEST
 partition by LIST (CUST_STATE)
 (
     partition PER1_WEST values ('OR', 'WA')
   , partition PER1_EAST values ('NY', 'CT')
   , partition PER1_CENT values ('OK', 'TX')
 )
 ;

Table created.

Insert the data, into the newly created table.

SQL> insert into EXP_TEMP_PER1 select * from composite_rng_list partition (per1);

2 rows created.

Elapsed: 00:00:00.26
SQL> commit;

Commit complete.

SQL> select * from composite_rng_list partition (per1);

   CUST_ID CUST_NAME                 CU TIME_ID
---------- ------------------------- -- ---------
         4 D                         WA 11-NOV-13
         6 F                         OR 07-JUL-13
SQL> select * from EXP_TEMP_PER1 ;

   CUST_ID CUST_NAME                 CU TIME_ID
---------- ------------------------- -- ---------
         4 D                         WA 11-NOV-13
         6 F                         OR 07-JUL-13

Note, the table created has partitions, no subpartitions

SQL> select table_owner,table_name,partition_name,subpartition_count,num_rows,tablespace_name from dba_tab_partitions where table_owner='ANAND' and table_name='EXP_TEMP_PER1';

TABLE_OWN TABLE_NAME             PARTITION_NAME  SUBPARTITION_COUNT   NUM_ROWS TABLESPACE_NAME
--------- ---------------------- --------------- ------------------ ---------- ---------------
ANAND     EXP_TEMP_PER1          PER1_WEST                        0          2 TEST
ANAND     EXP_TEMP_PER1          PER1_EAST                        0          0 TEST
ANAND     EXP_TEMP_PER1          PER1_CENT                        0          0 TEST

Elapsed: 00:00:00.10
SQL> select table_owner,table_name,partition_name,subpartition_name,high_value,num_rows from dba_tab_subpartitions where table_owner='ANAND' and table_name='EXP_TEMP_PER1';

no rows selected

Lets create local indexes similar to source table on the new table

SQL> create index i_exptper1_time on EXP_TEMP_PER1 (time_id) local;

Index created.

SQL> @partition_index
Enter value for index_name: i_exptper1_time
Enter value for owner: ANAND

INDEX_NAME            PARTITION_NA STATUS   TABLESPACE_NAME INITIAL_EXTENT NEXT_EXTENT PCT_INCREASE   NUM_ROWS
--------------------- ------------ -------- --------------- -------------- ----------- ------------ ----------
I_EXPTPER1_TIME       PER1_WEST    USABLE   TEST                   8388608     1048576                       2
I_EXPTPER1_TIME       PER1_EAST    USABLE   TEST                                                             0
I_EXPTPER1_TIME       PER1_CENT    USABLE   TEST                                                             0

Lets try the exchange partition

SQL> alter table COMPOSITE_RNG_LIST exchange partition per1 with table EXP_TEMP_PER1 including indexes with validation update global indexes;

Table altered.

SQL> select table_owner,table_name,partition_name,subpartition_name,high_value,num_rows,tablespace_name from dba_tab_subpartitions  where table_owner='ANAND' and table_name='COMPOSITE_RNG_LIST' and subpartition_name='PER1';

TABLE_OWN TABLE_NAME             PARTITION_NA SUBPARTITION HIGH_VALUE   NUM_ROWS TABLESPACE_NAME
--------- ---------------------- ------------ ------------ ---------- ---------- ---------------
ANAND     COMPOSITE_RNG_LIST     PER1         PER1_WEST    'OR', 'WA'          2 TEST
ANAND     COMPOSITE_RNG_LIST     PER1         PER1_EAST    'NY', 'CT'          0 TEST
ANAND     COMPOSITE_RNG_LIST     PER1         PER1_CENT    'OK', 'TX'          0 TEST

SQL> select index_name,partition_name,subpartition_name,tablespace_name,status from dba_ind_subpartitions where index_owner='ANAND' and index_name='I_CRL_TIME';

INDEX_NAME            PARTITION_NA SUBPARTITION TABLESPACE_NAME STATUS
--------------------- ------------ ------------ --------------- --------
I_CRL_TIME            FUTURE       FUTURE_CENT  DEMO            USABLE
I_CRL_TIME            FUTURE       FUTURE_EAST  DEMO            USABLE
I_CRL_TIME            FUTURE       FUTURE_WEST  DEMO            USABLE
I_CRL_TIME            PER3         PER3_CENT    DEMO            UNUSABLE
I_CRL_TIME            PER3         PER3_EAST    DEMO            UNUSABLE
I_CRL_TIME            PER3         PER3_WEST    DEMO            UNUSABLE
I_CRL_TIME            PER2         PER2_CENT    DEMO            UNUSABLE
I_CRL_TIME            PER2         PER2_EAST    DEMO            UNUSABLE
I_CRL_TIME            PER2         PER2_WEST    DEMO            UNUSABLE
I_CRL_TIME            PER1         PER1_CENT    TEST            USABLE
I_CRL_TIME            PER1         PER1_EAST    TEST            USABLE
I_CRL_TIME            PER1         PER1_WEST    TEST            USABLE

SQL> select table_owner,table_name,partition_name,subpartition_count,num_rows,tablespace_name from dba_tab_partitions where table_owner='ANAND' and table_name='COMPOSITE_RNG_LIST';


TABLE_OWN TABLE_NAME             PARTITION_NA SUBPARTITION_COUNT   NUM_ROWS TABLESPACE_NAME
--------- ---------------------- ------------ ------------------ ---------- ---------------
ANAND     COMPOSITE_RNG_LIST     PER1                          3          2 DEMO
ANAND     COMPOSITE_RNG_LIST     PER2                          3          2 DEMO
ANAND     COMPOSITE_RNG_LIST     PER3                          3          2 DEMO
ANAND     COMPOSITE_RNG_LIST     FUTURE                        3          0 DEMO

As we can see above the local indexes status is “USABLE”. For partitions PER3 and PER2 we see status “UNUSABLE” because I did not use “including indexes” attribute in exchange partition for them.

From Oracle Documentation

When you exchange partitions, logging attributes are preserved. You can optionally specify if local indexes are also to be exchanged (INCLUDING INDEXES clause), and if rows are to be validated for proper mapping (WITH VALIDATION clause).

When you specify WITHOUT VALIDATION for the exchange partition operation, this is normally a fast operation because it involves only data dictionary updates. However, if the table or partitioned table involved in the exchange operation has a primary key or unique constraint enabled, then the exchange operation is performed as if WITH VALIDATION were specified to maintain the integrity of the constraints.
To avoid the overhead of this validation activity, issue the following statement for each constraint before doing the exchange partition operation:

ALTER TABLE table_name
DISABLE CONSTRAINT constraint_name KEEP INDEX
Then, enable the constraints after the exchange.

If you specify WITHOUT VALIDATION, then you must ensure that the data to be exchanged belongs in the partition you exchange.

Unless you specify UPDATE INDEXES, the database marks UNUSABLE the global indexes or all global index partitions on the table whose partition is being exchanged. Global indexes or global index partitions on the table being exchanged remain invalidated. (You cannot use UPDATE INDEXES for index-organized tables. Use UPDATE GLOBAL INDEXES instead.)

DBA_TAB_SUPARTITIONS shows TEST tablespace, but still the view dba_tab_partitions, show DEMO tablespace for the partition. What needs to be done, so dba_tab_partitions show TEST tablespace for the partition? Please do respond :)

Reference

Exchanging Partitions

http://psoug.org/reference/partitions.html

Groups – OSDBA OSOPER OSASM

This one is quick and short. I have been asked few time, how can we check what value was specified to OSDBA, OSOPER and OSASM,especially during upgrades (to know what was it set to which initial installation)

# The DBA_GROUP is the OS group which is to be granted OSDBA privileges.
# The OPER_GROUP is the OS group which is to be granted OSOPER privileges.
# The OSASM_GROUP is the OS group which is to be granted OSASM privileges.

The below is from GRID_HOME :-

[grid@testdb1 dbs]$ grep “define SS_” $ORACLE_HOME/rdbms/lib/config.c
#define SS_DBA_GRP “asmdba”
#define SS_OPER_GRP “”
#define SS_ASM_GRP “asmadmin”

Even installation logfiles can be checked.

Silent Upgrade Oracle GoldenGate 12.1.2.0.0 to 12.1.2.1.2

Today I worked on silent upgrade of Oracle GoldenGate from 12.1.2.0.0 to 12.1.2.1.2, so thought to blog it which might help some of you. These are the steps I performed

Current GoldenGate Home –> /oracle/app/product/ogg12.1.2

1. Gather the details of GoldenGate Processes before stopping them

GGSCI> INFO EXTRACT EXTL, SHOWCH
INFO EXTRACT DPUMP, SHOWCH
GGSCI> SEND EXTRACT EXTL, SHOWTRANS
GGSCI> STOP EXTRACT DPUMP
GGSCI> STOP EXTRACT EXTL
GGSCI> SEND REPLICAT REPL STATUS
GGSCI> STOP REPLICAT REPL
GGSCI> STOP JAGENT
GGSCI> STOP MANAGER

2. Backup the existing binaries and associated files needed for your environment.

3. Edit the response file and run the runInstaller in silent mode

In the response file

#-------------------------------------------------------------------------------
# Specify the installation option.
# Specify ORA12c for installing Oracle GoldenGate for Oracle Database 12c and
#         ORA11g for installing Oracle GoldenGate for Oracle Database 11g
#-------------------------------------------------------------------------------
INSTALL_OPTION=ORA11g

#-------------------------------------------------------------------------------
# Specify a location to install Oracle GoldenGate
#-------------------------------------------------------------------------------
SOFTWARE_LOCATION=/oracle2/app/product/ogg12.1.2

#-------------------------------------------------------------------------------
# Specify true to start the manager after installation.
#-------------------------------------------------------------------------------
START_MANAGER=false

#-------------------------------------------------------------------------------
# Specify a free port within the valid range for the manager process.
# Required only if START_MANAGER is true.
#-------------------------------------------------------------------------------
MANAGER_PORT=

#-------------------------------------------------------------------------------
# Specify the location of the Oracle Database.
# Required only if START_MANAGER is true.
#-------------------------------------------------------------------------------
DATABASE_LOCATION=

#-------------------------------------------------------------------------------
# Specify the location which holds the install inventory files.
# This is an optional parameter if installing on
# Windows based Operating System.
#-------------------------------------------------------------------------------
INVENTORY_LOCATION=/oracle/oraInventory

#-------------------------------------------------------------------------------
# Unix group to be set for the inventory directory.
# This parameter is not applicable if installing on
# Windows based Operating System.
#-------------------------------------------------------------------------------
UNIX_GROUP_NAME=oinstall

Run the runInstaller in Silent mode

[oracle@oracle gg]$ export ORACLE_HOME=/oracle/app/product/ogg12.1.2
[oracle@oracle gg]$ export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/OPatch:$PATH
[oracle@oracle gg]$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
[oracle@oracle Disk1]$ ./runInstaller -silent -showProgress -waitforcompletion -responseFile /opt/oracle/ogg_upg.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 181836 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 11999 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2015-04-21_12-23-03AM. Please wait ...You can find the log of this install session at:
 /oracle/oraInventory/logs/installActions2015-04-21_12-23-03AM.log

Prepare in progress.
..................................................   10% Done.

Prepare successful.

Copy files in progress.
..................................................   44% Done.
..................................................   50% Done.
..................................................   60% Done.
..................................................   65% Done.
..................................................   72% Done.
....................
Copy files successful.

Link binaries in progress.
..........
Link binaries successful.
..................................................   82% Done.

Setup files in progress.
..................................................   100% Done.

Setup files successful.
The installation of Oracle GoldenGate Core was successful.
Please check '/oracle/oraInventory/logs/silentInstall2015-04-21_12-23-03AM.log' for more details.
Successfully Setup Software.
[oracle@oracle1 Disk1]$

4. Confirm GoldenGate has been upgraded

[oracle@oracle Disk1]$ cd $ORACLE_HOME
[oracle@oracle ogg12.1.2]$
[oracle@oracle ogg12.1.2]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.1.2.1.0 OGGCORE_12.1.2.1.0_PLATFORMS_140727.2135.1_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Aug  7 2014 09:14:25
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2014, Oracle and/or its affiliates. All rights reserved.

5. Apply Patch 20265694: Oracle GoldenGate V12.1.2.1.2 for Oracle 11G

[oracle@oracle1 20265694]$ opatch apply
Invoking OPatch 11.2.0.1.7

Oracle Interim Patch Installer version 11.2.0.1.7
Copyright (c) 2011, Oracle Corporation.  All rights reserved.


Oracle Home       : /oracle/app/product/ogg12.1.2
Central Inventory : /oracle/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.7
OUI version       : 11.2.0.3.0
Log file location : /oracle/app/product/ogg12.1.2/cfgtoollogs/opatch/opatch2015-04-21_00-24-15AM.log

Applying interim patch '20265694' to OH '/oracle/app/product/ogg12.1.2'
Verifying environment and performing prerequisite checks...

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/oracle/app/product/ogg12.1.2')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...

Patching component oracle.oggcore.ora11g, 12.1.2.1.0...
Patch 20265694 successfully applied
Log file location: /oracle/app/product/ogg12.1.2/cfgtoollogs/opatch/opatch2015-04-21_00-24-15AM.log

OPatch succeeded.
[oracle@oracle1 20265694]$
[oracle@oracle1 20265694]$ opatch lsinv
Invoking OPatch 11.2.0.1.7

Oracle Interim Patch Installer version 11.2.0.1.7
Copyright (c) 2011, Oracle Corporation.  All rights reserved.
........................
........................
Patch  20265694     : applied on Tue Apr 21 00:24:27 EDT 2015
Unique Patch ID:  18593256
   Created on 3 Feb 2015, 15:03:21 hrs PST8PDT
   Bugs fixed:
     19818362, 19602692, 19241234, 17423191, 19781984, 19535319, 19724915
     19721652, 19516537, 19441114, 19132627, 19889991, 19681035

6. Confirm the GoldenGate version is now 12.1.2.1.2

[oracle@oracle ogg12.1.2]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.1.2.1.2 20133048 OGGCORE_12.1.2.1.0OGGBP_PLATFORMS_141228.0533_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Dec 28 2014 13:19:44
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2015, Oracle and/or its affiliates. All rights reserved.

7. Start the GoldenGate Processes


GGSCI (oracle.test.com) 1> dblogin userid ggate, Password "xxxxxxxx"
Successfully logged into database.

GGSCI> START MANAGER
GGSCI> START JAGENT
GGSCI> START EXTRACT EXTL
GGSCI> START EXTRACT DPUMP
GGSCI> START REPLICAT REPL

ORA-00353: log corruption near block change time – 11.2.0.3

Recently I came across “ORA-00353: log corruption near block change time” error.

Errors in file /oracle/app/diag/rdbms/test/test/trace/test_ms00_54284.trc  (incident=1490189):
ORA-00353: log corruption near block  change  time
ORA-00353: log corruption near block 4631374 change 263470473806052 time 04/16/2015 06:53:16
ORA-00312: online log 43 thread 3: '+REDO02/test/onlinelog/group_43.258.875922665'
ORA-00312: online log 43 thread 3: '+REDO01/test/onlinelog/group_43.258.875922633'
Incident details in: /oracle/app/diag/rdbms/test/test/incident/incdir_1490189/test_ms00_54284_i1490189.trc
Thu Apr 16 08:14:42 2015
Sweep [inc][1490188]: completed
Thu Apr 16 08:14:55 2015
LOGMINER: WARNING: error 353 encountered, failed to read corrupt logfile +REDO01/test/onlinelog/group_43.258.875922633

As per ORA-354 ORA-353 and ORA-312 on Redo Log Group members (Doc ID 332672.1)

Possible corruption in Online Redo Log File Members in a Redo Log Group.

If Oracle detects a corruption when archiving a log member, it will try to read the same redo block from the second member of the group. If the block is found corrupt in all the members of the redo log group, archiving will not proceed.

I started validating the online redo logfile member and luckily none of the members were corrupt. Incase you find corrupt members please refer to above mentioned Doc “Doc ID 332672.1″ for solution.

As a precaution I validated all the online redo logfiles to confirm for corruption. If the redo log file is corrupt it should produce an error.

SQL> alter system dump logfile '+REDO01/test/onlinelog/group_43.258.875922633' validate;

System altered.

Elapsed: 00:01:21.89

SQL> alter system dump logfile '+REDO02/test/onlinelog/group_43.258.875922665' validate;

System altered.

Elapsed: 00:01:10.61

Interestingly I noticed trace file “test_ms00_54284.trc” of size 49Gb and continuously increasing. The file was being continuously written by GoldenGate process.

The issue seemed similar to one mentioned in

Bug 13840711 – ORA-353 in Standby / Streams Data Capture or ORA-272 in PRIMARY: Redo log corruption by ASYNC redo shipping (Doc ID 13840711.8)

Corrupt redo blocks are read from in memory log buffer (not from disk) by ASYNC redo shipping or hotmining clients.

ORA-353 may be reported by the process reading the redo log block from in memory log buffer (e.g. Logminer Reader – MSnn background process, Streams data capture).

Looking at the goldengate status showed

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
JAGENT      RUNNING
EXTRACT     RUNNING     DPUMP       00:00:00      00:00:03
EXTRACT     RUNNING     EXTL        02:05:36      00:00:07
REPLICAT    RUNNING     REPL        00:00:00      00:00:06

Meanwhile we received another error, this time the extract process which failed reading from online redo logfile, failed reading from archivelog file too. I checked the standby database, and it was insync with Primary.

Errors in file /oracle/app/diag/rdbms/test/test/trace/test_ms00_54284.trc (incident=1490190):
ORA-00353: log corruption near block 4631374 change 605163911323 time 01/06/1988 07:26:12
ORA-00334: archived log: ‘+FLASH01/test/archivelog/2015_04_16/thread_3_seq_9370.7790.877158359′

Incident details in: /oracle/app/diag/rdbms/test/test/incident/incdir_1490190/test_ms00_54284_i1490190.trc
Thu Apr 16 09:19:13 2015
Dumping diagnostic data in directory=[cdmp_20150416091913], requested by (instance=1, osid=54284 (MS00)), summary=[incident=1490190].
Thu Apr 16 09:19:27 2015
Errors in file /oracle/app/diag/rdbms/test/test/trace/test_ms00_54284.trc (incident=1490191):
ORA-00356: inconsistent lengths in change description
ORA-00353: log corruption near block 4631374 change 605163911323 time 01/06/1988 07:26:12
ORA-00334: archived log: ‘+FLASH01/test/archivelog/2015_04_16/thread_3_seq_9370.7790.877158359′
Incident details in: /oracle/app/diag/rdbms/test/test/incident/incdir_1490191/test_ms00_54284_i1490191.trc

At this point, client stopped all the GoldenGate processes, deleted the trace file and started the processes.I started monitoring the Extract Process

 send extract EXTL status

Sending STATUS request to EXTRACT EXTL ...


EXTRACT EXTL (PID 102390)
  Current status: In recovery[1]: Processing data

  Current read position:
  Redo thread #: 3
  Sequence #: 9351
  RBA: 4023516192
  Timestamp: 2015-04-16 03:37:35.000000
  SCN: 140.3838686301
  Current write position:
  Sequence #: 1637
  RBA: 1551
  Timestamp: 2015-04-16 09:31:56.423132
  Extract Trail: /oracle2/app/product/ogg12.1.2/dirdat/capture/sd

Monitored the Extract, and it was successfully able to read and extract from SEQ 9370, for which it had earlier failed.

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
JAGENT      RUNNING
EXTRACT     RUNNING     DPUMP       00:00:01      00:00:07
EXTRACT     RUNNING     EXTL        00:00:01      00:00:00
REPLICAT    RUNNING     REPL        00:00:00      00:00:06

send extract EXTL status

Sending STATUS request to EXTRACT EXTL ...


EXTRACT EXTL (PID 102390)
  Current status: Recovery complete: Processing data

  Current read position:
  Redo thread #: 3
  Sequence #: 9395
  RBA: 1721029512
  Timestamp: 2015-04-16 12:08:00.000000
  SCN: 140.3917853576
  Current write position:
  Sequence #: 1641
  RBA: 997030325
  Timestamp: 2015-04-16 12:07:57.056673
  Extract Trail: /oracle2/app/product/ogg12.1.2/dirdat/capture/sd

lag extract EXTL

Sending GETLAG request to EXTRACT EXT1OL ...
Last record lag 0 seconds.

lag extract DPUMP

Sending GETLAG request to EXTRACT DPUMP ...
Last record lag 2 seconds.
At EOF, no more records to process.

Current DB version is 11.2.0.3.13. As mentioned in the doc, fix for bug 13840711 is first included in 11.2.0.3.14 (Apr 2015) Database Patch Set Update (DB PSU) , we do plan to apply it in coming weeks.

ORA-01628: max # extents (32765) reached for rollback segment _SYSSMUxxx

Recently I came across “ORA-01628: max # extents (32765)” error. The database had resumable_timeout parameter set I received page for a session in resumable state. Looking into the alert log showed

statement in resumable session ‘User USER_E3(375), Session 9028, Instance 1′ was suspended due to
ORA-01628: max # extents (32765) reached for rollback segment _SYSSMU375_247595031$
Wed Apr 08 02:41:42 2015

Looking at the sid# 9028 details, INSERT sql was running and the session was on waiting on “statement suspended, wait error to be cl” as resumable_timeout parameter was set.

PRD04> @undo_stats
 
Undo Stats for Automatic Undo Management for Last 3 Hours
 
BEGIN_TIME         END_TIME           ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION   TXNCOUNT MAXQUERYLEN NOSPACEERRCNT
------------------ ------------------ ---------- ------------- ----------- ------------------- ---------- ----------- -------------
07-APR-15          08-APR-15              207496      22622976     6746320               46384     670317       11587             0
08-APR-15          08-APR-15              204488      22562776     6797160               46215     736330       10327             0
08-APR-15          08-APR-15              119832      22329656     7295296               46251     645728       10934             0
08-APR-15          08-APR-15              171992      22368920     7304336               46010     532932       11539             0
08-APR-15          08-APR-15              109968      22591672     7245352               46057     649534       12145             0
08-APR-15          08-APR-15              113016      22505688     7349552               45925     679114       12751             0
08-APR-15          08-APR-15              218048      22355184     7317048               46147     665002       13357             0
08-APR-15          08-APR-15              242368      22534704     7360800               46188     695974       13962             0
08-APR-15          08-APR-15              298368      22489016     7546968               45427     622305        5092             0
08-APR-15          08-APR-15               98792      22417864     7941104               44771     587418         702             0
08-APR-15          08-APR-15              151888      22744560     7860176               43790     703158        1308             0
08-APR-15          08-APR-15              353272      22606848     8085272               42483     670949        1913             0
08-APR-15          08-APR-15              679376      21995016     8654552               40691     662446        1245             0
08-APR-15          08-APR-15             1172864      21419400     8860792               39531     780894        1851             0
08-APR-15          08-APR-15             1093784      21696040     8664424               36463          0        2457             0
08-APR-15          08-APR-15             1374528      21315992     8761640               33362     733162        3062             0
08-APR-15          08-APR-15             1433128      21166848     8852944               31892     755982        3668           158
08-APR-15          08-APR-15             1454984      21040312     8960960               31767     692704        4274           293
08-APR-15          08-APR-15             1454984      21040312     8960960               31596     266300        4577           108
 
19 rows selected.

Transactions Using Rolback:
 
RBS        OS User         DB User         SID   Schema     Object Name          Type       Used RBS Blocks  # of Records
---------- --------------- --------------- ----- ---------- -------------------- ---------- ---------------- ------------------
...................
...................
_SYSSMU375 root            USER_E3         9028  ADMIN_E TESTS             TABLE PART 385255           23794634
_SYSSMU375 root            USER_E3         9028  ADMIN_E TESTS             TABLE      385255           23794634
...................
...................

26 rows selected.

 
PRD04> show parameter undo
 
NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
undo_management                      string     AUTO
undo_retention                       integer    7200
undo_tablespace                      string     UNDOTBS4

PRD04> @undo_extent_stats
 
STATUS                                 MB       PERC
------------------------------ ---------- ----------
ACTIVE                              11741          3
EXPIRED                            225670         49
UNEXPIRED                          170579         37

Looking at the sessions with active transactions

PRD04> SELECT s.sid, s.serial#, s.username, u.segment_name, count(u.extent_id) "Extent Count", t.used_ublk, t.used_urec, s.program
FROM v$session s, v$transaction t, dba_undo_extents u
WHERE s.taddr = t.addr and u.segment_name like '_SYSSMU'||t.xidusn||'_%$' and u.status = 'ACTIVE'
GROUP BY s.sid, s.serial#, s.username, u.segment_name, t.used_ublk, t.used_urec, s.program
ORDER BY t.used_ublk desc, t.used_urec desc, s.sid, s.serial#, s.username, s.program;
 
    SID    SERIAL# USERNAME               SEGMENT_NAME                   Extent Count  USED_UBLK  USED_UREC PROGRAM
------- ---------- ---------------------- ------------------------------ ------------ ---------- ---------- ------------------------------------------------
   9028      15075 USER_E3                _SYSSMU375_247595031$                 32726     389285   24043132 php@script01.example.com
   2317      38231 USER_E2                _SYSSMU344_2152184027$                   11       8303     509953 php@script06.example.com
   8930      34713 USER_E3                _SYSSMU366_2128258416$                    6       4930     401890 php@script02.example.com
   5469       3267 USER_E4                _SYSSMU362_25371301$                      3       1704     116802 php@script02.example.com
..........
..........

13 rows selected

As per Metalink Doc “Troubleshooting ORA-1628 – max # extents (32765) reached for rollback segment (Doc ID 1580182.1)”

With automatic undo, you have no control over the extent sizes, nor do you have the ability to shrink them. This all happens automatically. The system automatically decides on the extent size, however it will generally start allocating larger extents if an undo segment is extending a lot. The large number of extents in the rollback segment is likely due to fragmentation in the undo tablespace: Oracle can probably only allocate extents of 64k due to fragmentation, so it is very probable to hit the max extents issue.

The maximum number of extents for undo segments is limited to 32K and a long/large running transaction can exhaust this limit by adding new extents if the next extent of the current one is not an expired one and finally will receive ORA-1628.

So, after getting the ORA-1628 errors in the transaction that extended the undo segment until its limit, future transactions will not be allowed to bind to the undo segment until it is not shrinked (you may see that the number of extents is decreased).

So, The two major causes of ORA-1628 issue are a very large transaction or undo tablespace fragmentation.

In case of large transaction, That is solved by splitting the large transaction to smaller ones (e.g. frequent commits).

In case of undo tablespace fragmentation, That is solved by recreating the undo tablespace (this is also the recommended solution of Bug 10330444 and Bug 10229998 which were filed for the same issue and closed as not a bug).

The UNDO tablespace had around 10Gb free, so adding space to UNDO wouldn’t resolve the issue. We informed the client’s Dev team and it was taken care.

Below are few sqls related to UNDO which I think can be useful while diagnosing issues


col "Parameter" format A32
col "Session Value" format A32
col "Instance Value" format A32
select a.inst_id, a.ksppinm "Parameter",
b.ksppstvl "Session Value",
c.ksppstvl "Instance Value"
from x$ksppi a, x$ksppcv b, x$ksppsv c
where a.indx = b.indx and a.indx = c.indx
and a.inst_id=b.inst_id and b.inst_id=c.inst_id
and a.ksppinm in ('_smu_debug_mode', '_highthreshold_undoretention')
order by 2;

   INST_ID Parameter			    Session Value		     Instance Value
---------- -------------------------------- -------------------------------- --------------------------------
	 1 _highthreshold_undoretention     4294967294			     4294967294
	 1 _smu_debug_mode		    0				     0


select segment_name,
   round(nvl(sum(act),0)/(1024*1024*1024),3 ) "ACT GB BYTES",
   round(nvl(sum(unexp),0)/(1024*1024*1024),3) "UNEXP GB BYTES",
   round(nvl(sum(exp),0)/(1024*1024*1024),3) "EXP GB BYTES",
   NO_OF_EXTENTS
   from ( select segment_name, nvl(sum(bytes),0) act,00 unexp, 00 exp, count(*) NO_OF_EXTENTS
   from DBA_UNDO_EXTENTS
   where status='ACTIVE' and tablespace_name = 'UNDOTBS4'
   group by segment_name
   union
   select segment_name,00 act, nvl(sum(bytes),0) unexp, 00 exp , count(*) NO_OF_EXTENTS
   from DBA_UNDO_EXTENTS
   where status='UNEXPIRED' and tablespace_name = 'UNDOTBS4'
   group by segment_name
   union
   select segment_name, 00 act, 00 unexp, nvl(sum(bytes),0) exp, count(*) NO_OF_EXTENTS
   from DBA_UNDO_EXTENTS
   where status='EXPIRED' and tablespace_name = 'UNDOTBS4'
   group by segment_name
   ) group by segment_name, NO_OF_EXTENTS having NO_OF_EXTENTS >= 30 order by 5 desc;

SEGMENT_NAME		       ACT GB BYTES UNEXP GB BYTES EXP GB BYTES NO_OF_EXTENTS
------------------------------ ------------ -------------- ------------ -------------
_SYSSMU375_247595031$			  0	    10.082	      0 	32765
_SYSSMU313_3664260743$			  0	     1.209	      0 	 2925
_SYSSMU316_1833322353$			  0		 0	   .607 	 2299
_SYSSMU367_1976067076$			  0		 0	   .788 	 2003
_SYSSMU577_2158293664$			  0		 0	  1.186 	 1710
_SYSSMU503_170622332$			  0		 0	   .604 	 1617
_SYSSMU330_231237204$			  0	      .899	      0 	 1523
_SYSSMU336_3955560249$			  0	     1.375	      0 	 1240
_SYSSMU5059_1468117272$ 		  0		.5	      0 	 1089
_SYSSMU306_1485117688$			  0	    12.145	      0 	  964

We can now take look at extent details for the specific undo segment which caused error. The undo segment was primarily made up of tiny (64 KB) extents

PRD04> break on report
PRD04> compute sum label Total of Extent_Count Extent_MB on report
PRD04> col Extent_MB format 999,999.00
PRD04> SELECT segment_name, bytes/1024 "Extent_Size_KB", count(extent_id) "Extent_Count", bytes * count(extent_id) / power(1024, 2) "Extent_MB" FROM dba_undo_extents WHERE segment_name = '_SYSSMU375_247595031$' group by segment_name, bytes order by 1, 3 desc;

SEGMENT_NAME		       Extent_Size_KB Extent_Count   Extent_MB
------------------------------ -------------- ------------ -----------
_SYSSMU375_247595031$			   64	     31858    1,991.13
_SYSSMU375_247595031$			 8192	       493    3,944.00
_SYSSMU375_247595031$			 4096	       178	712.00
_SYSSMU375_247595031$			 2048		23	 46.00
_SYSSMU375_247595031$			 1024		22	 22.00
_SYSSMU375_247595031$			16384		22	352.00
_SYSSMU375_247595031$			65536		17    1,088.00
_SYSSMU375_247595031$			 3072		12	 36.00
_SYSSMU375_247595031$			24576		12	288.00
_SYSSMU375_247595031$			12288		10	120.00
_SYSSMU375_247595031$			49152		 8	384.00
_SYSSMU375_247595031$			 1152		 7	  7.88
_SYSSMU375_247595031$			 1088		 6	  6.38
_SYSSMU375_247595031$			 3136		 6	 18.38
_SYSSMU375_247595031$			32768		 5	160.00
_SYSSMU375_247595031$			57344		 5	280.00
_SYSSMU375_247595031$			20480		 5	100.00
_SYSSMU375_247595031$			28672		 5	140.00
_SYSSMU375_247595031$			40960		 4	160.00
_SYSSMU375_247595031$			 1216		 4	  4.75
_SYSSMU375_247595031$			 6144		 4	 24.00
_SYSSMU375_247595031$			 3264		 3	  9.56
_SYSSMU375_247595031$			 3200		 3	  9.38
_SYSSMU375_247595031$			 5248		 2	 10.25
_SYSSMU375_247595031$			 2240		 2	  4.38
_SYSSMU375_247595031$			 3776		 2	  7.38
_SYSSMU375_247595031$			 2368		 2	  4.63
_SYSSMU375_247595031$			 2112		 2	  4.13
_SYSSMU375_247595031$			 5120		 2	 10.00
_SYSSMU375_247595031$			 1344		 2	  2.63
_SYSSMU375_247595031$			 3328		 2	  6.50
_SYSSMU375_247595031$			 2752		 2	  5.38
_SYSSMU375_247595031$			 3392		 2	  6.63
_SYSSMU375_247595031$			 2432		 2	  4.75
_SYSSMU375_247595031$			 7616		 1	  7.44
_SYSSMU375_247595031$			36864		 1	 36.00
_SYSSMU375_247595031$			 3584		 1	  3.50
_SYSSMU375_247595031$			 4032		 1	  3.94
_SYSSMU375_247595031$			 2304		 1	  2.25
_SYSSMU375_247595031$			 7872		 1	  7.69
_SYSSMU375_247595031$			 1280		 1	  1.25
_SYSSMU375_247595031$			11776		 1	 11.50
_SYSSMU375_247595031$			 5440		 1	  5.31
_SYSSMU375_247595031$			 7360		 1	  7.19
_SYSSMU375_247595031$			11264		 1	 11.00
_SYSSMU375_247595031$			 3712		 1	  3.63
_SYSSMU375_247595031$			35520		 1	 34.69
_SYSSMU375_247595031$			14336		 1	 14.00
_SYSSMU375_247595031$			59648		 1	 58.25
_SYSSMU375_247595031$			 6400		 1	  6.25
_SYSSMU375_247595031$			 2560		 1	  2.50
_SYSSMU375_247595031$			 3456		 1	  3.38
_SYSSMU375_247595031$			37888		 1	 37.00
_SYSSMU375_247595031$			 9536		 1	  9.31
_SYSSMU375_247595031$			 6336		 1	  6.19
_SYSSMU375_247595031$			15168		 1	 14.81
_SYSSMU375_247595031$			14400		 1	 14.06
_SYSSMU375_247595031$			 3520		 1	  3.44
_SYSSMU375_247595031$			 6528		 1	  6.38
_SYSSMU375_247595031$			19520		 1	 19.06
_SYSSMU375_247595031$			 1600		 1	  1.56
_SYSSMU375_247595031$			 2176		 1	  2.13
_SYSSMU375_247595031$			 6272		 1	  6.13
_SYSSMU375_247595031$			 9344		 1	  9.13
_SYSSMU375_247595031$			 5376		 1	  5.25
					      ------------ -----------
Total						     32765   10,324.25

65 rows selected.

ORA-17629 ORA-20079

Yesterday I received page for ” ORA-17629: Cannot connect to the remote database server”  reported in Primary Database alert log.

Fri Apr 10 06:47:41 2015
Errors in file /oracle/app/diag/rdbms/testdb/testdb/trace/testdb_ora_161991.trc:
ORA-17627:
ORA-17629: Cannot connect to the remote database server
Errors in file /oracle/app/diag/rdbms/testdb/testdb/trace/testdb_ora_161991.trc:
ORA-17629: Cannot connect to the remote database server
ORA-17627:
ORA-17629: Cannot connect to the remote database server
Fri Apr 10 06:48:22 2015

The error trace file shows

*** 2015-04-10 06:47:13.005
*** SESSION ID:(3024.45081) 2015-04-10 06:47:13.005

*** CLIENT ID:() 2015-04-10 06:47:13.005
*** SERVICE NAME:(testdb) 2015-04-10 06:47:13.005
*** MODULE NAME:(oracle@oracle2.testdbds.com (TNS V1-V3)) 2015-04-10 06:47:13.005

*** ACTION NAME:() 2015-04-10 06:47:13.005

krbmsrvgrcf:Expected db_unique_name: testdb
krbmsrvgrcf:mydbuname = :testdb:
krbmsrvgrcf:Instance with connect id testdbSBY requested controlfile.
krbmsrvgrcf:Filespec: /oracle/app/product/11.2_1/dbs/snappcf_testdb.file

*** 2015-04-10 06:47:41.467
OCI error val is 184283056 and errmsg is ”
ORA-17627:
ORA-17629: Cannot connect to the remote database server
******************** WARNING ***************************
The errors during Server autobackup are not fatal, as it
is attempted after sucessful completion of the command.
However, it is recomended to take an RMAN control file
backup as soon as possible because the Autobackup failed
with the following error:
ORA-17629: Cannot connect to the remote database server
ORA-17627:
ORA-17629: Cannot connect to the remote database server
******************** END OF WARNING *******************
ORA-17629: Cannot connect to the remote database server
ORA-17627:
ORA-17629: Cannot connect to the remote database server

The trace file mentions the SID,SERIAL# and the module name. The module shows “oracle@oracle2.testdbds.com” which is the standby database.

I thought to check the sid details from v$active_session_history and the session is coming in from standby database and definitely related to RMAN.

SQL> @ash_sid
Enter value for sid: 3024
Enter value for serial: 45081
old  17: where    session_id = &sid and session_serial# = &serial
new  17: where    session_id = 3024 and session_serial# = 45081

                                                                         SQL                                                                                        Dur   CPU DB Time
Sid,Serial<Blk  STIME    Module               SqlId:ChildNo    SqlExecId Start CPH S PRXJBZ Sta    SEQ# EVENT                            P1:P2:P3   WCLASS           ms    ms      ms
--------------- -------- -------------------- ---------------- --------- ----- ------------ ---- ------ ------------------------ ------------------ ----------- ------- ----- -------
3024,45081      06:47:13 oracle@oracle2.testdbds.com                           NNN N NNNNN  Wait    129 control file single writ 9999999:  1024:  9 System I/O      105     8     113
                06:47:14                                                       NNN N NNNNN  Wait    269 control file single writ 9999999:  5504:  9 System I/O
                06:47:15                                                       NNN N NNNNN  Wait    395 control file single writ 9999999:  9536:  9 System I/O
                06:47:16                                                       NNN N NNNNN  Wait    473 control file single writ 9999999: 12032:  9 System I/O
                06:47:17                                                       NNN N NNNNN  Wait    560 control file sequential  9999999: 14848:  9 System I/O
                06:47:18                                                       NNN N NNNNN  Wait    673 control file single writ 9999999: 18432:  9 System I/O
                06:47:19                                                       NNN N NNNNN  Wait    768 control file sequential  9999999: 21504:  9 System I/O
                06:47:20                                                       NNN N NNNNN  Wait    863 control file single writ 9999999: 24512:  9 System I/O
                06:47:21                                                       NNN N NNNNN  Wait    959 control file single writ 9999999: 27584:  9 System I/O
                06:47:22                                                       NNN N NNNNN  cpu    1044                          9999999: 30336:  9
                06:47:23                                                       NNN N NNNNN  cpu    1132                          9999999: 33152:  9
                06:47:24                                                       NNN N NNNNN  Wait   1220 control file sequential  9999999: 35968:  9 System I/O
                06:47:25                                                       NNN N NNNNN  Wait   1318 control file sequential  9999999: 39104:  9 System I/O
                06:47:26                                                       NNN N NNNNN  Wait   1405 control file single writ 9999999: 41856:  9 System I/O
                06:47:27                                                       NNN N NNNNN  Wait   1489 control file single writ 9999999: 44544:  9 System I/O
                06:47:28                                                       NNN N NNNNN  Wait   1567 control file single writ 9999999: 47040:  9 System I/O
                06:47:29                                                       NNN N NNNNN  Wait   1649 control file single writ 9999999: 49664:  9 System I/O
                06:47:30                                                       NNN N NNNNN  Wait   1733 control file single writ 9999999: 52352:  9 System I/O
                06:47:31                                                       NNN N NNNNN  cpu    1817                          9999999: 55040:  9
                06:47:32                                                       NNN N NNNNN  Wait   1895 control file single writ 9999999: 57536:  9 System I/O
                06:47:33                                                       NNN N NNNNN  Wait   1965 control file single writ 9999999: 59776:  9 System I/O
                06:47:34                                                       NNN N NNNNN  Wait   2041 control file single writ 9999999: 62208:  9 System I/O
                06:47:35                                                       NNN N NNNNN  Wait   2113 control file single writ 9999999: 64512:  9 System I/O
                06:47:36                                                       NNN N NNNNN  Wait   2179 control file single writ 9999999: 66624:  9 System I/O
                06:47:37                                                       NNN N NNNNN  Wait   2238 control file sequential  9999999: 68544:  9 System I/O
                06:47:38                                                       NNN N NNNNN  Wait   2307 control file single writ 9999999: 70720:  9 System I/O
                06:47:39                                                       NNN N NNNNN  Wait   2369 control file single writ 9999999: 72704:  9 System I/O
                06:47:40                                                       NNN N NNNNN  Wait   2444 control file sequential  9999999: 75136:  9 System I/O


28 rows selected.

Looking into MOS I found “RMAN-06820 ORA-17629 During Backup at Standby Site (Doc ID 1616074.1)”. It stated something interesting, so I thought to mention it here, which I did not know

Change in 11.2.0.4 onward

Per ‘unpublished’ Bug 8740124, as of 11.2.0.4, we now include the current standby redo log as part of an RMAN archivelog backup at the standby site. This is achieved by forcing a log switch at the primary site.

I looked at the archive log backup trace file on standby database and there was no error reported. Then I started looking into the backup script and saw, after the backup is completed, resync function is called which connects to rcat database and perform resync catalog.

cle2.testdbs logs]$ more testdb_backup_testdb_resync_201504100617_Fri.log

Recovery Manager: Release 11.2.0.3.0 – Production on Fri Apr 10 06:46:55 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

RMAN>
echo set on

RMAN>

RMAN> connect target *
connected to target database: testdb (DBID=3948365078)

RMAN> connect catalog *
connected to recovery catalog database

RMAN> resync catalog;
starting partial resync of recovery catalog
ORA-20079: full resync from primary database is not done

doing automatic resync from primary
resyncing from database with DB_UNIQUE_NAME testdb
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of resync command on default channel at 04/10/2015 06:47:13
ORA-17629: Cannot connect to the remote database server

ORA-17628: Oracle error 17629 returned by remote Oracle server

RMAN> exit;

Recovery Manager complete.

Tried “show all for db_unique_name testdb;” from standby

RMAN> show all for db_unique_name testdb;

ORA-20079: full resync from primary database is not done

doing automatic resync from primary
resyncing from database with DB_UNIQUE_NAME testdb
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of show command at 04/10/2015 07:53:40
RMAN-03014: implicit resync of recovery catalog failed
RMAN-03009: failure of partial resync command on default channel at 04/10/2015 07:53:40
ORA-17629: Cannot connect to the remote database server
ORA-17628: Oracle error 17629 returned by remote Oracle server

Connected to Primary Database and ran the same command

[oracle@oracle1 anand]$ rman target /

Recovery Manager: Release 11.2.0.3.0 – Production on Fri Apr 10 07:55:43 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to target database: testdb (DBID=3948365078)

RMAN> connect catalog rcat/xxxxxxxx@rcat

connected to recovery catalog database

RMAN> show all for db_unique_name testdb;

starting full resync of recovery catalog

full resync complete
RMAN configuration parameters for database with db_unique_name testdb are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 10 DAYS;
CONFIGURE BACKUP OPTIMIZATION OFF;
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO ‘%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 8 BACKUP TYPE TO BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1;
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT ‘/mnt/oracle-backup/testdb/%U';
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM ‘AES192′;
CONFIGURE COMPRESSION ALGORITHM ‘BASIC’ AS OF RELEASE ‘DEFAULT’ OPTIMIZE FOR LOAD FALSE;
CONFIGURE DB_UNIQUE_NAME ‘testdbSBY’ CONNECT IDENTIFIER ‘testdbSBY';
CONFIGURE DB_UNIQUE_NAME ‘testdb’ CONNECT IDENTIFIER ‘testdb';
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE;
CONFIGURE SNAPSHOT CONTROLFILE NAME TO ‘/oracle/app/product/11.2_1/dbs/snappcf_testdb.file';

RMAN>
RMAN> exit

The full resync completed on Primary database and then I performed “resync catalog” on standby as it completed successfully.

RMAN> resync catalog
2> ;

starting partial resync of recovery catalog
partial resync complete

RMAN> exit

Primary on FileSystem and Standby on ASM

For one of the client, standby server went down. We had another standby server which was kept down for more than a month. Decision was taken to start the server and apply incremental SCN based backup on the standby database.

The standby was on ASM and the Primary on filesystem.Incremental backup was started from the SCN reported by below query

select min(fhscn) from x$kcvfh;

Once the backup completed, it was transferred to standby, standby was mounted (using the old controlfile), backups were cataloged and recovery performed using ‘recover database noredo’.

The recovery was going on, and was handed over to me. After the recovery completed I restored the latest controlfile from Primary and mounted the standby. At this point the controlfile was with the information of the filesystem as they were on the Primary side. The next step was to register everything we had on Standby side:

[oracle@oracle3:~ (db)]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Thu Jan 8 04:57:01 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ABCD (DBID=1463580380, not open)

RMAN> catalog  start with '+DATA/adbc_oracle3/DATAFILE/';

searching for all files that match the pattern +DATA/adbc_oracle3/DATAFILE/

List of Files Unknown to the Database
=====================================
File Name: +data/adbc_oracle3/DATAFILE/TEST.256.844395985
File Name: +data/adbc_oracle3/DATAFILE/TEST.257.844397067
.................
.................
File Name: +data/adbc_oracle3/DATAFILE/TEST.416.865953683

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: +data/adbc_oracle3/DATAFILE/TEST.256.844395985
File Name: +data/adbc_oracle3/DATAFILE/TEST.257.844397067
................
...............
File Name: +data/adbc_oracle3/DATAFILE/TEST.416.865953683

RMAN> report schema;

RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name ABCD_ORACLE3

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    0        SYSTEM               ***     /oradata/abcd_oracle2/datafile/o1_mf_system_825xkscr_.dbf
2    0        SYSAUX               ***     /oradata/abcd_oracle2/datafile/o1_mf_sysaux_825y451r_.dbf
3    0        TEST                 ***     /oradata/abcd_oracle2/datafile/o1_mf_test_825s84mw_.dbf
4    0        TEST                 ***     /oradata2/abcd_oracle2/datafile/o1_mf_test_8dr1v332_.dbf
................
................
................
147  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hcrh_.dbf
148  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hdhf_.dbf
149  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hf6o_.dbf
150  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_b8k8hg1j_.dbf
151  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_bb318bhs_.dbf
152  0        TEST                   ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_bb318cff_.dbf
153  0        TEST_INDEX             ***     /oradata4/abcd_oracle2/datafile/o1_mf_test_index_bb318pmy_.dbf
154  0        TEST_NOLOGGING         ***     /oradata3/abcd_oracle2/datafile/o1_mf_test_nolog_bbm2s7vk_.dbf
155  0        TESTINDEX             ***     /oradata3/abcd_oracle2/datafile/o1_mf_test_index_bbm2z7nv_.dbf
156  0        PERFSTAT             ***     /oradata3/abcd_oracle2/datafile/o1_mf_perfstat_bbm312pf_.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
3    15104    TEMP                 32767       /oradata4/abcd_oracle2/datafile/o1_mf_temp_b633ppbr_.tmp
4    25600    TEMP                 32767       /oradata4/abcd_oracle2/datafile/o1_mf_temp_b633ppcf_.tmp

After catalog completed, it was time to switch database to copy and it failed with below error

RMAN> switch database to copy;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of switch to copy command at 01/08/2015 05:00:25
RMAN-06571: datafile 151 does not have recoverable copy

After some analysis found, it was due to missing datafile on standby. The datafile was created on Primary after the standby was down and the recovery using the incremental backup was done with the older controlfile which had no information about the new datafiles.

Datafile # 151 – 156 were missing, so cataloged the backup pieces again as the controlfile was restored, and started restoring the datafile

RMAN> restore datafile 151;

Starting restore at 08-JAN-15
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00151 to /oradata4/ABCD_ORACLE2/datafile/o1_mf_ct_bb318bhs_.dbf
channel ORA_DISK_1: reading from backup piece +BACKUP/ABCD_ORACLE3/backupset/restore/incr_standby_5sps58rd_1_1

channel ORA_DISK_1: piece handle=+BACKUP/ABCD_ORACLE3/backupset/restore/incr_standby_5sps58rd_1_1 tag=TAG20150107T192825
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:19:05
Finished restore at 08-JAN-15

RMAN>

After the restoration completed, report schema showed

RMAN> report schema;

RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name adbc_oracle3

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    0        SYSTEM               ***     /oradata/ABCD_ORACLE2/datafile/o1_mf_system_825xkscr_.dbf
2    0        SYSAUX               ***     /oradata/ABCD_ORACLE2/datafile/o1_mf_sysaux_825y451r_.dbf
.................
151  30720    TEST                  ***     +DATA/adbc_oracle3/datafile/test.417.868424659 
152  30720    TEST                  ***     +DATA/adbc_oracle3/datafile/test.418.868424659 
................

Tried to perform “switch database to copy” which again failed with the same error “RMAN-06571: datafile 151 does not have recoverable copy” . At this point I though to use “switch datafile to copy” for which generated dynamic sql from primary and ran in to standby.

Generated switch command sql from Primary :-

SQL> select 'switch datafile '||file#||' to copy;' from v$datafile;

[oracle@oracle3:~ (db)]$ vi swtch_copy.rman
[oracle@oracle3:~ (db)]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Thu Jan 8 06:09:46 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ABCD (DBID=1463580380, not open)

RMAN> @swtch_copy.rman

RMAN> switch datafile 2 to copy;
using target database control file instead of recovery catalog
datafile 2 switched to datafile copy "+DATA/adbc_oracle3/datafile/sysaux.277.844418721"

............................
...........................
...........................
RMAN> switch datafile 149 to copy;
datafile 149 switched to datafile copy "+DATA/adbc_oracle3/datafile/ct.415.865953681"

RMAN> switch datafile 156 to copy;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of switch to copy command at 01/08/2015 06:28:54
RMAN-06571: datafile 156 does not have recoverable copy

RMAN>
RMAN> **end-of-file**

Performed ‘recover database noredo’ again and this time it was pretty quick and then tried recover standby database

[oracle@oracle3:~/working/anand (abcd)]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jan 8 06:31:38 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> recover standby database;
ORA-00279: change 38328244436 generated at 01/07/2015 19:28:32 needed for thread 1
ORA-00289: suggestion : +FRA
ORA-00280: change 38328244436 for thread 1 is in sequence #98501


Specify log: {=suggested | filename | AUTO | CANCEL}
^C
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '+DATA/adbc_oracle3/datafile/system.357.844454911'


Did few changes in DGMGRL parameters and enabled standby configuration. After few hours, standby was insync with Primary. Stopped the MRP , dropped the standby
redo logfiles as they showed filesystem, and created on ASM. Opened the standby in read only mode and started the MRP.