quinta-feira, 15 de setembro de 2022

Row Count + DBMS_COMPARISON - Logical Replication

Hi all,

I was in a project where logical replication was in place.

It was going from 11g non-exa to 19c on Exadata and they ran into some issues during their last migration attempt a year ago.

At that time OGG was sending data from 11g to 19c and another third-party tool was sending from 19c back to 11g (in case of rollback needed). The problem was that this tool is based on triggers to capture and DML commands to replicate, but you can imagine how bad it is when you have a large number of transactions and 2 database nodes.

Now, they decided to use OGG for everything and it was a great decision.

The customer asked me to validate the data (source vs destination) to make sure everything was in sync.

My first thought was to use OGG Veridata, but the license cost was an impediment to this project.

Last time, they used some procedures to do row counts at source and destination and compare the data. I got this code and worked for a few minutes to improve it. There were 16 procedures and about 30 tables and it’s now one procedure and a few tables. It’s not perfect and there is room to improve, but you can get an idea here.

As mentioned by Connor in this video, it’s not a regular situation and this kind of thing should be carefully analyzed before starting to count every row in every table.

Another point from the business area was to validate if the data was the same, not only the number of rows. For this case, I decided to use DBMS_COMPARISON.

Here you can see an example of how to generate the scripts to validate:

select '
BEGIN
DBMS_COMPARISON.drop_comparison (
    comparison_name    => ''cutover_comp_bm'');
END;
/

BEGIN
  DBMS_COMPARISON.create_comparison (
    comparison_name    => ''cutover_comp_bm'',
    schema_name        => ''YOUR_SCHEMA'',
    object_name        => '''||table_name||''',
    dblink_name        => ''db_compare'',
    remote_schema_name => ''YOUR_SCHEMA'',
    remote_object_name => '''||table_name||''');
END;
/


SET SERVEROUTPUT ON
DECLARE
  l_scan_info  DBMS_COMPARISON.comparison_type;
  l_result     BOOLEAN;
  v_comparison_name    varchar2(100):= ''cutover_comp_bm'';
BEGIN
  l_result := DBMS_COMPARISON.compare (
                comparison_name => v_comparison_name,
                scan_info       => l_scan_info,
                perform_row_dif => TRUE
              );

  IF NOT l_result THEN
    DBMS_OUTPUT.put_line(v_comparison_name||'' Differences found. scan_id='' || l_scan_info.scan_id);
  ELSE
    DBMS_OUTPUT.put_line(v_comparison_name||'' No differences found.'');
  END IF;
END;
/

' 
from dba_tables
where owner='YOUR_SCHEMA';

Some tables were really big in this database(billion of rows) and for this situation, I found the parameter scan_mode.

You can use something like:

    scan_mode          => dbms_comparison.CMP_SCAN_MODE_RANDOM,
    scan_percent       => 0.001    

DBMS_COMPARISON cannot compare LOB columns and you can use this parameter to limit the columns to be compared:

column_list        => 'YOUR COLUMNS SEPARATED BY COMMA'

You can use this SQL to generate a column list:

SELECT LISTAGG(column_name, ',') WITHIN GROUP (ORDER BY column_id)
  FROM dba_tab_columns
 WHERE owner = 'YOUR_OWNER'
   and table_name='YOUR_TABLE'
   and data_type not like '%LOB%';

For 21c, I found the CHECKSUM and I'm planning to take a look at it for future projects.

Thanks
Alex

0 comentários: