CDC March 15 2008
Change Data Capture (CDC) a good way for capturing changing in source database (ETL)
Assume that you are using ETL to load a data warehouse and that your source data is the familiar EMP table
This article will show you steps based on an oracle Feature called Change Data Capture (CDC) that has been around since oracle9i that will automatically track changes of the EMP source table
Advantages of this solution
it is based on Publisher / Subscriber paradigm
It allows for incremental capturing, for example, the subscriber can only see changes since last night's load
Source Table EMP
Publisher Schema : SCOTT
Note: The publisher does not really need to be the owner of the source table
The publisher needs to have the SELECT_CATALOG_ROLE and the EXECUTE_CATALOG_ROLE
SQL> GRANT SELECT_CATALOG_ROLE TO SCOTT;
SQL> GRANT EXECUTE_CATALOG_ROLE TO SCOTT;
CREATE CHANGE TABLE
a change table needs to be created by the publisher, this table shall accommodate the changed data
The following command needs to be issued by the Publisher (note the dash (-) at the end of each is just to indicate the the command is not yet finished and will continue on the next line
Another way of the running the command is of course by creating a PL/SQL block and remove the EXECUTE key word and remove all the dashes
--BEGIN
DBMS_LOGMNR_CDC .....
-- END;
SQL> EXECUTE DBMS_LOGMNR_CDC_PUBLISH.CREATE_CHANGE_TABLE ( -
> OWNER => 'SCOTT', -
> CHANGE_TABLE_NAME => 'CDC_EMP', -
> CHANGE_SET_NAME => 'SYNC_SET', -
> SOURCE_SCHEMA => 'SCOTT', -
> SOURCE_TABLE => 'EMP', -
> COLUMN_TYPE_LIST => 'EMPNO NUMBER, -
> ENAME VARCHAR2(20), -
> DEPTNO NUMBER, -
> SAL NUMBER ', -
> CAPTURE_VALUES => 'BOTH', -
> RS_ID => 'N', -
> ROW_ID => 'N', -
> USER_ID => 'Y', -
> TIMESTAMP => 'Y', -
> OBJECT_ID => 'N', -
> SOURCE_COLMAP => 'N', -
> TARGET_COLMAP => 'N', -
> OPTIONS_STRING => NULL);
PL/SQL procedure successfully completed.
At this point, the source, the publisher and the change data have been identified
SUBSCRIBER : Target Schema
in our case I have a schema called AMMAR and is already created
Publisher gives select permission to subscriber
SQL> connec scott/tiger
Connected.
SQL> grant select on emp to ammar;
Grant succeeded.
SQL> grant select on cdc_emp to ammar;
Grant succeeded.
(note : if for any reason you want to drop the change_Table , you must user the
exec dbms_cdc_publish.drop_change_table('SCOTT','CDC_EMP','Y');
after recreating the change table do not forget to execute the GRANTS shown above again
Subscribe to source table and activate subscription
Subscriber must issue the following
declare
sub_handle number;
begin
-- CREATE SUBSCRIPTION HANDLE
DBMS_LOGMNR_CDC_SUBSCRIBE.GET_SUBSCRIPTION_HANDLE(
CHANGE_SET=>'SYNC_SET',
DESCRIPTION=>'Change data from EMP table ',
SUBSCRIPTION_HANDLE=>sub_handle);
dbms_output.put_line(sub_handle);
-- SUBSCRIBE TO PUBLISHER SCHEMA AND TABLE
DBMS_LOGMNR_CDC_SUBSCRIBE.SUBSCRIBE(
SUBSCRIPTION_HANDLE=>sub_handle,
SOURCE_SCHEMA=>'SCOTT',
SOURCE_TABLE=>'EMP',
COLUMN_LIST=>'EMPNO, ENAME, DEPTNO ,SAL');
--ACTIVATE SUBSCRIPTION
DBMS_LOGMNR_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(
SUBSCRIPTION_HANDLE=>SUB_HANDLE);
END;
If you get an error like no publication found, make sure that the subscriber has access to the source (check the GRANTS)
simulate any activity at the EMP table at this point
The Source changes are captured here
Window:
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.EXTEND_WINDOW( SUBSCRIPTION_HANDLE=>6);
n Think of this command as if a snapshot of source data is taken at this stage. Changes after this command will not be seen, until you purge the window and extend the window again as will be shown later
This prepares a view for the subscriber through which he/she can access the changes
Subscriber View
the subscriber view will allow the subscriber to view the changes using a view name that is created by the following command. There is not need to access the actual change table
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.PREPARE_SUBSCRIBER_VIEW(SUBSCRIPTION_HANDLE =>6,-
SOURCE_SCHEMA=>'SCOTT', -
SOURCE_TABLE=>'EMP', -
VIEW_NAME => :sv_price_list);
SQL> select :sv_price_list from dual;
:SV_PRICE_LIST
--------------------------------
CDC#CV$663198
SQL> select * from CDC#CV$663198;
OP CSCN$ COMMIT_TI TIMESTAMP USERNAME$ DEPTNO EMPNO ENAME SAL
--- ---------- --------- --------- ------------------ ------------ ------ -------------- -------------- ------------- --------
I 5036988 15-MAR-08 15-MAR-08 SCOTT 10 7554 AMMAR 1200
I 5036988 15-MAR-08 15-MAR-08 SCOTT 10 7544 SAJDI 1500
Note the OP is the operation (I) is an Insert operation
Drop the subscriber View and purge the window and the cycle is over, when you extent the window again , you shall capture the increment after this cycle
EXECUTE -
DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIBER_VIEW( -
SUBSCRIPTION_HANDLE =>6, -
SOURCE_SCHEMA=>'SCOTT', -
SOURCE_TABLE=>'EMP');
EXECUTE -
DBMS_LOGMNR_CDC_SUBSCRIBE.PURGE_WINDOW( -
SUBSCRIPTION_HANDLE=>6);
SQL> select * from CDC#CV$663198;
no rows selected
Here you can repeat the Window and Subscriber View again to view the next increment