Thursday, August 5, 2010

Performance Improvement - Part 3: Correlated Update tuning

Performance Improvement - Part 3: Correlated Update tuning:

We have seen the small difference in correlated update statement execution in the previous Blog. Please refer below link if need to check it.


http://karthikeyanbaskaran.blogspot.com/2010/06/difference-in-correlated-update.html

The scenario is, need to update the “Address” field in table”Main” from the “Sub” table using the “id” field. The -1 needs to update in all other columns. The below update method solve it in single statement.

update main mset address = nvl((select address from sub s where m.id=s.id),-1);
Correlated update statement process for all the rows in the table “MAIN”. We see how it works when we load the lot records in the table “MAIN” and performance improvement method.


SQL> insert into main
2 select * from main;
............
.......
....
SQL> insert into main
2 select * from main;

Now the table “Main” is having the 1048576 records and table “Sub” is having the 3 records.

SQL> select count(*) from main;

COUNT(*)
---------

1048576

SQL> select count(*) from sub;

COUNT(*)
----------

3
The number of distinct address data is same now and we start our analysis process.

SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
NNNNNNN 1048576

The below update statement works very well for our scenario. But every time it updates for the 1048576 records.

All the time it updates the 1048576 records, even if there are less difference needs to update it in the table “Main” from “Sub”.

SQL> set timing on
SQL> set autotrace traceonly
SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1);

1048576 rows updated.

Elapsed: 00:01:44.04
Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 TABLE ACCESS (FULL) OF 'MAIN'
3 1 TABLE ACCESS (FULL) OF 'SUB'

Statistics
----------------------------------------------------------
805 recursive calls
1073721 db block gets
2659 consistent gets
2124 physical reads
252412780 redo size
618 bytes sent via SQL*Net to client
582 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
1048576 rows processed


The statement took 01:44 minutes to complete and it generated the 252412780 bytes. We roll back the results.


SQL> rollback;

Rollback complete.

Elapsed: 00:02:26.09

Performance improved script:

The WHERE condition has added to update statement and it updates the records which is difference comparing with Table “Main” and “Sub”.

SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1)
3 where address is null
4 or address <> nvl((select address from sub s where m.id=s.id),-1);

1048576 rows updated.

Elapsed: 00:01:42.05

Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'MAIN'
4 2 TABLE ACCESS (FULL) OF 'SUB'
5 1 TABLE ACCESS (FULL) OF 'SUB'

Statistics
----------------------------------------------------------
536 recursive calls
1073630 db block gets
2611 consistent gets
1963 physical reads
252411468 redo size
627 bytes sent via SQL*Net to client
672 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1048576 rows processed


The statement took 01:42 minutes to complete and it generated the 252411468 bytes. There is no big difference when it updates all the records.

SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
-1 524288
A 262144
C 262144

Elapsed: 00:00:01.03
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00

We change the some set of records to check the execution time.

SQL> update sub set address='M' where id =4;

1 row updated.

Elapsed: 00:00:00.02
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL> set autotrace traceonly
SQL>
SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1);
1048576 rows updated.

Elapsed: 00:01:51.00

Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 TABLE ACCESS (FULL) OF 'MAIN'
3 1 TABLE ACCESS (FULL) OF 'SUB'




Statistics
----------------------------------------------------------
624 recursive calls
1072662 db block gets
4953 consistent gets
2313 physical reads
248380252 redo size
629 bytes sent via SQL*Net to client
584 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1048576 rows processed



The statement took 01:51minutes to complete and it generated the 248380252 bytes.


SQL> set autotrace off;
SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
-1 524288
A 262144
M 262144

Elapsed: 00:00:00.09

Rollback the result.

SQL> rollback;

Rollback complete.

Elapsed: 00:04:40.08
SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
-1 524288
A 262144
C 262144


Performance improved script:

SQL> set autotrace traceonly

SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1)
3 where address is null
4 or address <> nvl((select address from sub s where m.id=s.id),-1);

262144 rows updated.

Elapsed: 00:00:45.04

Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'MAIN'
4 2 TABLE ACCESS (FULL) OF 'SUB'
5 1 TABLE ACCESS (FULL) OF 'SUB'




Statistics
----------------------------------------------------------
176 recursive calls
268198 db block gets
2421 consistent gets
2054 physical reads
62104336 redo size
628 bytes sent via SQL*Net to client
672 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
262144 rows processed


The statement took 00:45.04 seconds to complete and it generated the 62104336 bytes.

SQL> set autotrace off;
SQL> select 248380252-62104336 redo_diff from dual;

REDO_DIFF
----------
186275916

The second update statement is avoiding the 186275916 bytes of the redo. So it is saving space in the UNDO and space in archive folders.

This method gives the good performance if it updates some part records in table.


0 comments:

Post a Comment