CDC                        March 15 2008

Change Data Capture (CDC)  a good way for capturing changing in source database (ETL)

Assume that you are using ETL to load a data warehouse and that your source data is the familiar EMP table

This article will show you steps based on an oracle Feature called Change Data Capture (CDC) that has been around since oracle9i that will automatically track changes of the EMP source table

Advantages of this solution

it is based on Publisher / Subscriber paradigm

It allows for incremental capturing, for example, the subscriber can only see changes since last night's load

 

Source Table  EMP

Publisher  Schema :  SCOTT

Note: The publisher does not  really need to be the owner of the source table

The publisher needs to have the SELECT_CATALOG_ROLE  and the EXECUTE_CATALOG_ROLE

 

SQL> GRANT SELECT_CATALOG_ROLE TO SCOTT;

SQL>  GRANT EXECUTE_CATALOG_ROLE TO SCOTT;

 

 

CREATE CHANGE TABLE

a change table needs to be created by the publisher, this table shall accommodate the changed data

 

The following command needs to be issued by the Publisher   (note the dash (-) at the end of each is just to indicate the the command is not yet finished and will continue on the next line

 

Another way of the running the command is of course by creating a PL/SQL block and remove the EXECUTE key word and remove all the dashes

--BEGIN

     DBMS_LOGMNR_CDC   .....

-- END;

 

SQL> EXECUTE DBMS_LOGMNR_CDC_PUBLISH.CREATE_CHANGE_TABLE ( -

> OWNER => 'SCOTT', -

> CHANGE_TABLE_NAME => 'CDC_EMP', -

> CHANGE_SET_NAME => 'SYNC_SET', -

> SOURCE_SCHEMA => 'SCOTT', -

> SOURCE_TABLE => 'EMP', -

> COLUMN_TYPE_LIST => 'EMPNO  NUMBER, -

>       ENAME VARCHAR2(20), -

>       DEPTNO NUMBER, -

>       SAL           NUMBER ',  -

> CAPTURE_VALUES => 'BOTH', -

> RS_ID => 'N', -

> ROW_ID => 'N', -

> USER_ID => 'Y', -

> TIMESTAMP => 'Y',  -

> OBJECT_ID => 'N', -

> SOURCE_COLMAP => 'N', -

> TARGET_COLMAP => 'N', -

> OPTIONS_STRING => NULL);

 

PL/SQL procedure successfully completed.

At this point, the source, the publisher and the change data have been identified 

 

SUBSCRIBER :  Target Schema

in our case I have a schema called AMMAR and is already created

 Publisher gives select permission to subscriber

 SQL> connec scott/tiger

Connected.

SQL> grant select on emp to ammar;

 Grant succeeded.

 SQL> grant select on cdc_emp to ammar;

Grant succeeded.

 

 

(note : if for any reason you want to drop the change_Table , you must user the

exec dbms_cdc_publish.drop_change_table('SCOTT','CDC_EMP','Y');

 

after recreating the change table do not forget to execute the GRANTS shown above again

 Subscribe to source table and activate subscription

Subscriber must issue the following

 

declare

   sub_handle   number;

begin

-- CREATE SUBSCRIPTION HANDLE

DBMS_LOGMNR_CDC_SUBSCRIBE.GET_SUBSCRIPTION_HANDLE(

    CHANGE_SET=>'SYNC_SET',

    DESCRIPTION=>'Change data from EMP table ',

    SUBSCRIPTION_HANDLE=>sub_handle);

dbms_output.put_line(sub_handle);

 

-- SUBSCRIBE TO PUBLISHER SCHEMA AND TABLE

DBMS_LOGMNR_CDC_SUBSCRIBE.SUBSCRIBE(

    SUBSCRIPTION_HANDLE=>sub_handle,

    SOURCE_SCHEMA=>'SCOTT',

    SOURCE_TABLE=>'EMP',

COLUMN_LIST=>'EMPNO, ENAME, DEPTNO ,SAL');

 

--ACTIVATE SUBSCRIPTION

 

DBMS_LOGMNR_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(

    SUBSCRIPTION_HANDLE=>SUB_HANDLE);

END;

  

If you get an error like  no publication found, make sure that the subscriber has access to the source (check the GRANTS)

simulate any activity at the EMP table at this point

The Source changes are captured here

 

 

Window:

 EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.EXTEND_WINDOW(   SUBSCRIPTION_HANDLE=>6);

n      Think of this command as if a snapshot of source data is taken at this stage.  Changes after this command will not be seen, until you purge the window and extend the window again as will be shown later

 This prepares a view for the subscriber through which he/she can access the changes

Subscriber View

 

the subscriber view will allow the subscriber to view the changes using a view name that is created by the following command.  There is not need to access the actual change table

 

EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.PREPARE_SUBSCRIBER_VIEW(SUBSCRIPTION_HANDLE =>6,-

 SOURCE_SCHEMA=>'SCOTT', -

 SOURCE_TABLE=>'EMP', -

 VIEW_NAME => :sv_price_list);

 

 SQL> select :sv_price_list from dual;

 

 :SV_PRICE_LIST

--------------------------------

CDC#CV$663198

 

SQL> select * from CDC#CV$663198;

 

          OP          CSCN$     COMMIT_TI        TIMESTAMP     USERNAME$   DEPTNO     EMPNO     ENAME       SAL

          ---             ----------       --------- ---------     ------------------     ------------ ------     -------------- --------------    -------------    --------

 I         5036988         15-MAR-08     15-MAR-08         SCOTT               10                 7554         AMMAR      1200

 I         5036988          15-MAR-08    15-MAR-08         SCOTT               10                 7544         SAJDI          1500

 

Note the OP is the operation (I) is an Insert operation 

 

Drop the subscriber View and purge the window and the cycle is over, when you extent the window  again , you shall capture the increment after this cycle

 

 

EXECUTE -

DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIBER_VIEW( -

    SUBSCRIPTION_HANDLE =>6, -

    SOURCE_SCHEMA=>'SCOTT', -

    SOURCE_TABLE=>'EMP');

 

 

EXECUTE -

DBMS_LOGMNR_CDC_SUBSCRIBE.PURGE_WINDOW( -

    SUBSCRIPTION_HANDLE=>6);

 

SQL> select * from CDC#CV$663198;

 

no rows selected

 

Here you can repeat the Window and Subscriber View again to view the next increment