Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

March 21, 2008

Oracle Change Data Capture under magnifier – Async CDC with 10gR2

Filed under: Oracle How To,Oracle Information Integration — H.Tonguç Yılmaz @ 1:02 pm

CDC is perceived like an aspirin in my organization, so why is that? Simply here the need is sharing and integration of information among a multitude of applications and databases and CDC make this data flow. Another variation of this need is instead of taking full extractions and calculating the change during data warehouse’s ETL process, CDC enables just to extract the changes from source databases. I guess this is why CDC is in Data Warehousing Guide as Chapter 16 :)

cdc-for-dwh.png

Sync methods like trigger or materialized view log to capture the change on tables have important performance affects on source systems. Below is a simple Sync CDC demo with a Materialized View Log.

Code Listing 211a – Sync CDC demo with Materialized View Log

Oracle Streams(Logminer+Advanced Queues(AQ)) is the technology behind Async CDC, it automatically;
– captures database changes at a source database,
– stages the changes,
– propagates the changes to one or more destination databases,
– and then applies the changes at the destination database(s).

All these are handled including both DML and DDL changes. And Oracle Streams integration can be;
– Within an Oracle database,
– Between two Oracle databases,
– Among multiple Oracle databases,
– Between an Oracle database and a non-Oracle database.

In Oracle the committed changes to data that take place in a database are recorded in the database redo log files. The Streams capture process extracts these changes from the redo log files(or archived log files) and formats each change into a logical change record(LCR). The LCRs are then stored(staged) in a queue. Next, Streams propagates LCRs from one queue(the producer queue) to another(the consumer queue) automatically and you can then apply(or consume) the LCRs from the consumer queue to the destination database with custom pl/sql applications depending on your need or want.

After 10g Release 2, Asynchronous Distributed CDC can be used against Oracle 9i Release 2 and higher sources. And there is enhanced CDC PL/SQL Control APIs(dbms_cdc_subscribe, dbms_cdc_publish) support for Async CDC setup and configuration. WE have two options after 10gR2;

Option 1 – Async Distributed Hotlog CDC Flow
a. Database Objects Record Changes
b. LGWR Log Changes to Online Redo Log
c. Capture Process Enqueue LCRs on local Streams Queue
d. LCRs are Propagated to target Streams Queue
e. Dequeue LCRs from target Streams Queue and Apply Changes to target Database Objects

Option 2 – Async Autolog CDC Flow
a. Database Objects Record Changes
b. LGWR Log Changes to Online Redo Log
c. ARCn Write Redo Data to local Archived Log File
d. Archived Log File is copied to target
e. Capture Process Enqueue LCRs to target Streams Queue from transferred Archived Log File
e. Dequeue LCRs from target Streams Queue and Apply Changes to target Database Objects

Important prerequisites for Async CDC are;
a. Source database must be in ARCHIVELOG mode,
b. the change capture table must be altered and kept logging or at database level FORCE LOGGING must be enabled for NOLOGGING operations. FORCE LOGGING on tablespace level may be preferred in some cases.
c. Supplemental Logging is enabled at database level or table level;

-- To enable minimal supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
-- other supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (UNIQUE) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (FOREIGN KEY) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
-- to drop supplemental logging
ALTER DATABASE DROP SUPPLEMENTAL LOG DATA;
-- to control supplemental logging
SELECT supplemental_log_data_min, supplemental_log_data_pk,
supplemental_log_data_ui, supplemental_log_data_fk, supplemental_log_data_all
FROM v$database;

d. Async CDC requires at least version 9206 on source database,
e. Autolog Async option which reduces the mining and queuing performance impact on source requires Oracle and OS versions to be same,
f. some init.ora settings are mandatory, and GLOBAL_NAMES=TRUE is advised by documentation. But most probably this setting is not exceptable at source because you need to make sure the name of the database link matches the name of the target database exactly after this setting. As a workaround a database logon trigger can be used;

CREATE OR REPLACE TRIGGER cdc_trig
AFTER LOGON
ON DATABASE
DECLARE
sqlstr VARCHAR2(200) := 'ALTER SESSION SET GLOBAL_NAMES=TRUE';
BEGIN
IF (USER IN ('USR1', 'USR1')) THEN
execute immediate sqlstr;
END IF;
END cdc_trig;
/

Based on Mark Van de Wiel’s great Async CDC Cookbook here is Async Distributed Hotlog CDC option in action between a HP-UX B.11.23 U 10g Enterprise Edition Release 10.2.0.2.0 – 64bit as source and SunOS 5.10 – 10g Enterprise Edition Release 10.2.0.3.0 – 64bit as target.

Code Listing 211b – Async Distributed Hotlog CDC option in action

Before my final words here is a brief summary for the above demonstration;

PRE SETUP STEPS –
A – required database parameters are set on both databases,
B – cdc admin users are created with requested grants on both databases,
C – dblinks are created on both databases,
C – force logging and supplemental logging is enabled at source only.

SETUP STEPS –
A – create_hotlog_change_source step at target,
B – create_change_set step at target,
C – create_change_table step at target,
D – alter_change_set step at target,
E – alter_hotlog_change_source step at target,
F – create_subscription step at target.

POST SETUP –
A – change test at source,
B – capture control at target.

And my final notes are as follows;
a. CDC performance and restrictions are mostly related to Streams feature,
b. Direct path load and nologging statements on the operational source database is a threat which is usually a rare situation on a traditional OLTP environment,
c. There are documented bugs and patches on metalink, so testing Async CDC for your own need is highly recommended;
Bug 4285404 – Streams propogation from 9206
Bug 4649767 – High system commit numbers(SCNs)
Bug 5024710 – Queue tables in System Tablespace
d. do not throw a mountain to kill a scorpion :) If you are able to easily capture change information through your application(some modification date columns for example) I advice you to go for them first.

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :

Asynchronous Change Data Capture Cookbook by Mr.Mark Van de Wiel – Feb 2006
Asynchronous Autolog Change Data Capture Cookbook by Mr.Mark Van de Wiel – Jan 2007
Mark Rittman’s article
Julian Dyke’s Streams and Supplemental Logging presentations
Streams and CDC demos at psoug.org
Streams related Oracle magazine article

3 Comments »

  1. Streams 101 Blog – http://wedostreams.blogspot.com/

    Comment by H.Tonguç Yılmaz — January 19, 2009 @ 6:19 am | Reply

  2. […] if you are interested in CDC option of Oracle please check this previous post of mine for an introduction and a Hotlog option implementation […]

    Pingback by Improvements with 11.1 for the datatypes and table structures supported for asynchronous change data capture option « H.Tonguç Yılmaz - Oracle Blog — January 26, 2009 @ 9:25 am | Reply

  3. This comment is a bit late, so I’m not sure if you will see it. However, you are blending the concepts of Streams with CDC… they are not the same. Using CDC does not automatically apply LCR’s to the target database… that’s what Streams does. CDC just makes the changes available to you, so you can do with them what you want. CDC is important in a data warehouse environment, because you don’t necessarily have an exact copy of the source schema to apply changes to. You are more likely applying these changes to a modified schema, either an ODS or a dimensional model. These changes will require the “T” in ETL… transformations.

    Good coverage of supplemental logging though. Nice post altogether.

    Stewart

    Comment by Stewart Bryson — June 18, 2010 @ 3:21 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: