Monday, July 19, 2010

Performance Improvement - Part 2: Hard parse is not bad

Performance Improvement - Part 2: Hard parse is not bad

In previous post, we have seen the soft parsing is doing less operation comparing with hard parsing.

Please have a look on below post before read this one.

http://karthikeyanbaskaran.blogspot.com/2010/07/performance-improvement-part-1-parsing.html

The Hard parse includes the parse, optimize, generation of the plan for the query and soft parse skips optimize and generation of the plan. So we decided that hard parse is not good for application performance.

Some times soft parsing is having performance issue. The below scripts proves that.
SQL@ORA10G> create table t
2 as
3 select case when rownum = 1
4 then 1 else 99 end id, a.*
5 from all_objects a;

Table created.

SQL@ORA10G> create index t_idx on t(id);

Index created.


SQL@ORA10G> analyze table t compute statistics;

Table analyzed.



The WHERE ID=1 condition will return one record and WHERE ID=99 will return all of the rest (about 40,000 records). Also, the optimizer is very aware of this fact, we can definitely see different plans for different inputs, as we are passing the literals in the WHERE clause.


SQL@ORA10G> set autotrace traceonly explain
SQL@ORA10G> select * from t where id = 1;

Execution Plan
----------------------------------------------------------


0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=141)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=1 Bytes=141)
2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=1)

SQL@ORA10G> select * from t where id = 99;

Execution Plan
----------------------------------------------------------


0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=137 Card=34323 Bytes=4839543)
1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=137 Card=34323 Bytes=4839543)






From Oracle9i Database Release 1 through Oracle Database 10g Release 2, Oracle Database will wait until the cursor is opened to do the actual optimization of the query—it will wait for the bind variable value to be supplied by the application before figuring out the right way to optimize the query. This is called bind variable peeking, when the optimizer first looks at the bind values and then optimizes the query.



In this case, however, depending on which inputs are used to first run the query, the database will either choose a full scan or an index range scan plus table access by index rowid. And in Oracle9i Database Release 1 through Oracle Database 10g Release 2, that is the plan that will be used to execute the SELECT * FROM t WHERE ID = :ID query, regardless of the subsequent bind values, until the query is hard-parsed and optimized again.

SQL@ORA10G> exec :id := 99

PL/SQL procedure successfully completed.

SQL@ORA10G> select * from t where id = :id;

Execution Plan
----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=137 Card=20391 Bytes=1814799)

1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=137 Card=20391 Bytes=1814799)





So we started off with ID=99 as the bind, and the optimizer chose a full scan. Therefore, regardless of the bind value, the database will execute a full scan from now on. For example:
SQL@ORA10G> exec :id :=1

PL/SQL procedure successfully completed.

SQL@ORA10G> select * from t where id = :id;

Execution Plan
----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=137 Card=20391 Bytes=1814799)

1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=137 Card=20391 Bytes=1814799)




The above result demonstrates that it is not going for the index range scan or table access by index rowid. In this case, the logical I/Os (Consistent gets) are more because of the FULL table scan instead of the index scan.

The result proves the “poorly performing query”.

Tuesday, July 13, 2010

Performance Improvement - Part 1: Parsing

Performance Improvement - Part 1: Parsing

In this article, we see how to improve the performance of the SQL queries. Before that we should know about the parsing and types of that.

Parsing:

This is the first step in the processing of any statement in Oracle. Parsing is the process of breaking the submitted statement down into its component parts. Determining what type of statement it is (whether Query, DML, DDL) and performing various checks on it.
The parsing process performs two main functions:

· Syntax Check
· Semantic Analysis

Syntax Check:

The syntax check performs the below operation.

Is the statement a valid one?
Does it make sense given the SQL grammar documented in the SQL Reference Manual? Does it follow all of the rules for SQL?

Semantic Analysis:
The semantic going beyond the syntax and it is performing the below operation.

Is the statement valid in light of the objects in the database (do the tables and columns referenced exist)?
Do you have access to the objects?
Are the proper privileges in place?
Are there ambiguities in the statement?
Soft parsing and hard parsing:
The next step in the parse operation is to see if the statement we are currently parsing has already in fact been processed by some other session. We may be in luck here If it has, we can skip the next two steps in the process, that is

1 .Optimization
2. Row source generation.

If we can skip these next two steps in the process, we have done what is known as a Soft Parse.


If we cannot, if we must do all of the steps, we are performing what is known as a Hard Parse. Hard parse includes the parse, optimize, generation of the plan for the query.
Example for Hard Parse:
The below scripts demonstrate the hard parsing.

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

System altered.

SQL> create table t
(
id number(2),
Des varchar2(20)
);

Table created.

insert into t values(1,'AA');
insert into t values(2,'BB');
insert into t values(3,'CC');
insert into t values(4,'DD');
insert into t values(5,'EE');
insert into t values(6,'FF');


SQL> select * from t;

ID DES
--- --------
1 AA
2 BB
3 CC
4 DD
5 EE
6 FF

6 rows selected.


SQL> select SQl_TEXT,executions,hash_value,child_latch,child_address from v$sql where sql_text like 'select * from t where id %';

no rows selected

SQL> select * from t where id = 2;

ID DES
--- --------
2 BB

SQL> select * from t where id = 2;

ID DES
--- --------
2 BB

SQL> select * from t where id = 3;

ID DES
--- --------
3 CC

SQL> select SQl_TEXT,executions,hash_value,child_latch,child_address from v$sql where sql_text like 'select * from t where id %';





The above result is showing that, it is creating the different hash value for “where id = 2” and “where id =3 “. That means it is going for the hard parse for the each time when we change the condition.

Example for Soft parse:

The Execution column is showing 2 for the “where id =2 “. First time it has gone for the hard parse and second time it is gone for the soft parse.

How to make soft parse for “where id =2 “and “where id =3 “. We use the bind variable and it solves the issue.



SQL> variable id number
SQL> exec :id := 1;

PL/SQL procedure successfully completed.

SQL> select * from t where id = :id;

ID DES
--- --------
1 AA

SQL> exec :id :=2 ;

PL/SQL procedure successfully completed.

SQL> select * from t where id = :id;

ID DES
--- ------
2 BB

SQL> exec :id :=2 ;

PL/SQL procedure successfully completed.

SQL> select * from t where id = :id;

ID DES
--- -------
2 BB


SQL> select SQl_TEXT,executions,hash_value,child_latch,child_address from v$sql where sql_text like 'select * from t where id %';


It is not creating the different plan for each time when we use the bind variable. The hash value vale has created for the script “where id = :id “ and the value of the variable (:id) is changing each time. Refer the “Execution field, It is showing 3 for the different values.
Hard parsing is very CPU intensive. If our applications we want a very high performance then our queries to be Soft Parsed (to be able to skip the optimize/generate phases)

If we have to Hard Parse a large percentage of our queries, our system will function slowly and in some cases not at all.

Now we have a strong proof for the soft parsing and that improve the performance. But soft parsing is not always good and hard parsing is not always bad. We see that in the next Blog.

Monday, July 5, 2010

Reusable scripts - Part1

Find the letter occurrences in the statement:

The below query is finding the number of occurrence letter “a” in the string.

select length('Find a letter in a string') - length(replace('Find a letter in a string', 'a','')) as "Number of occurrence" from dual ;

Number of occurrence
--------------------
2


Remove the alphabetic letters in the given string:

declare
l_start number;
len number;
i number;
p_str varchar2(1000) := 'aa123yksjds45';
p_str1 varchar2(1000);
res varchar2(1000);
begin
i:=0;
len :=length(p_str);
WHILE i <= len

loop
len := ASCII(substr(p_str,i,1));
IF len > 47 AND len < 58 then
i:= i+1;
else
p_str := replace(p_str,substr(p_str,i,1),'');
END IF;
end loop;
dbms_output.put_line(p_str);
end;

Result:

12345

Divide the given comma separated words:

declare v_comma_position number;
v_len number;
p_comma_val varchar2(1000) := 'value1,value2,value3,value4,value5'; p_comma_tmp varchar2(1000);
v_result varchar2(1000);
begin p_comma_val := p_comma_val ',';
v_len :=length(p_comma_val);
WHILE v_len >= 0
loop
v_comma_position := instr(p_comma_val,',')+1;
p_comma_tmp := '';v_result := '';
p_comma_tmp := p_comma_val;
p_comma_val:='';
v_result := substr(p_comma_tmp,1,v_comma_position-2);
p_comma_val := substr(p_comma_tmp,v_comma_position,v_len);
v_len := v_len - v_comma_position;
dbms_output.put_line( v_result);
end loop;
end;

Result:

value1
value2
value3
value4
value5


Monday, June 28, 2010

creating the composite index with more than 900 bytes



Creating the composite index with more than 900 bytes:
Sql server is having some constraints while creating the index. The index entry will not allow if the data size is more than 900 bytes. It gives the error while inserting the data.

There is a way to insert the more than 900 bytes into composite index. we see that in this article.

Intial setup:

Creating the table with two columns and total size is more than 900 bytes.

use tempdb
create table tmp
(
c1 varchar(10),
c2 varchar(1000)
)


Index creation:

The create index statement gives the warnings. Because total size is havng more than 900 bytes.

create index idx_tmp on tmp(c1,c2)

Warning! The maximum key length is 900 bytes. The index 'idx_tmp' has maximum length of 1010 bytes. For some combination of large values, the insert/update operation will fail.

Test data Length:
select len('

test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
')





Insert statement with more than 900 bytes:

The toltal size of the insert is 1010 bytes here. It is not allowing to insert and giving the error.


insert into tmp values( 'hai',
'test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data'
)

INCLUDE clause:

There is another way to avoid this error and create a index on both the columns.

drop index idx_tmp on tmp

By using the INCLUDE clause of the CREATE INDEX statement, the index key could be defined as (c1) and c2 defined as a nonkey column. In this way, the index key size would be 10 bytes.


CREATE INDEX idx_tmp2
ON tmp (c1)
INCLUDE (c2);

Insert statement with more than 900 bytes:

The toltal size of the insert is 1010 bytes here. This time it is allowing to insert

insert into tmp values( 'hai',
'test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data'
)



The index has included both the columns. The below query proves that.

select b.name index_name,a.name column_name from sys.columns a
inner join sys.indexes b
on a.object_id=b.object_id
where b.name ='idx_tmp2'





Addition to that we verify the explain plan. It goes for the index scan.
select * from tmp where c1='10'


select * from tmp where c2='10'



Tuesday, June 22, 2010

Good to Know in SQL server 2008

Good to Know in SQL server 2008

1. Insert values:

Many times we use “INSERT INTO tab_name VALUES (….) “way for inserting the data into table. There is a simple way in the SQL server 2008 to avoid multiple INSERT clauses for more than one record. It is possible in single statement.


use tempdb

Create table Dept(Id int, DName varchar(100))

Insert into Dept
values
(1,'A'),
(2,'B'),
(3,'C'),
(4,'D')

select * from Dept




2.Declare and initiate in same lines:

Old style ( SQL 2000 and 2005) :

declare @i int
set @i = 10
declare @dt datetime
set @dt = GETDATE()
select @i as Col1, @dt as Col2




New Style (SQL server2008):

declare @i int = 10
declare @dt datetime = getdate()
select @i as Col1, @dt col2
go

Monday, June 21, 2010

Pivot in SQL server

PIVOT:

PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output.

CREATE TABLE Sales (Make varchar(8), Year int, Sales int)
GO
Insert into Sales values ('Honda',2005,20000)
Insert into Sales values ('Honda',2005,60000)
Insert into Sales values ('Mahindra',2005,40000)
Insert into Sales values ('Honda',2006,30000)
Insert into Sales values ('Mahindra',2006,30000)
Insert into Sales values ('Mahindra',2006,60000)
Insert into Sales values ('Mahindra',2007,8000)
GO

Actual Group by, get Number of cars sold for each make in each year.
Select Make,YEAR,SUM(Sales) from Sales
group by Make,Year

Transposed of above query

Select * from Sales pivot(Sum(sales) for Year in ([2001],[2002],[2003],[2004],[2005],[2006],[2007])) t
go


Restricting the columns which we need to show.

Select * from Sales pivot(Sum(sales) for Year in ([2005],[2006],[2007])) t



List sum of each product sold for each year and product should have been sold all the years and not just 1 year.

Select * from Sales pivot(Sum(sales) for Year in ([2005],[2006],[2007])) t
where [2007] is not null


Thursday, June 10, 2010

Difference in Correlated Update

Difference in Correlated Update :



I came across the Correlated update statement with below format. It updates the address filed in the “Main” table and takes it from the “sub” table. The NVL function has given outside the sub query.

Outer NVL Query:
update main m
set address = nvl((select address from sub s where m.id=s.id),-1);

We can give the NVL function with the sub query (specifically for that column). Then what is the difference between “Outer NVL Query” and “Inner NVL Query”.
Inner NVL Query:
update main m
set address = (select nvl(address,-1) from sub s where m.id=s.id)

Initial setup:
This step creates the tables and populates it.

create table main ( id number, Address varchar2(10));

insert into main values (1,'NNNNNNN');
insert into main values (2,'NNNNNNN');
insert into main values (3,'NNNNNNN');
insert into main values (4,'NNNNNNN');

create table sub ( id number, Address varchar2(10));

insert into sub values (1,'A');
insert into sub values (2,null);
insert into sub values (4,'C');

select * from main;

select * from sub

Inner NVL Query:

update main m
set address = (select nvl(address,-1) from sub s where m.id=s.id);

It is applying the NVL function if NULL value comes from the “SUB” table. It is putting the NULL value in the “Main” table if it does not find the match in the “Sub” table.

We think that the outer NVL query works in the same way. But there is some difference.
Outer NVL Query:
update main m
set address = nvl((select address from sub s where m.id=s.id),-1)

We can see the difference in below result. It is applying the NVL function in the both the situation.

This is small Difference in the Correlated Update.