Thursday, December 30, 2010

Simulating ORACLE ROWNUM in SQL Server

Simulating ORACLE ROWNUM in SQL Server:
Here are a few approaches you can use to generate a pseudo row number in SQL Server.

The table format and data:

select * from A

c1 c2
-----
101 0
102 0
103 0

(3 row(s) affected)


Approach 1:
select rank = count(*), a.c1, a.c2
from A a, A b
where a.c1 + a.c2 >= b.c1 + b.c2
group by a.c1, a.c2

(Note this self-join trick relies on a.c1 + a.c2 being unique per row; duplicate sums would produce duplicate ranks.)


rank c1 c2
-----------
1 101 0
2 102 0
3 103 0

(3 row(s) affected)


Approach 2 :
select rank() OVER (ORDER BY a.c1,a.c2) as rank,a.c1,a.c2
from A a
order by rank

rank c1 c2
--------------------
1 101 0
2 102 0
3 103 0

(3 row(s) affected)


Finally, a very simple method:
if object_id('tempdb..#T') is not null drop table #T
Select IDENTITY(int, 1,1) id, * INTO #T from A
Select * from #T


id c1 c2
-----------
1 101 0
2 102 0
3 103 0

(3 row(s) affected)
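
On SQL Server 2005 and later, ROW_NUMBER() maps most directly to Oracle's ROWNUM; a minimal sketch against the same table:

select ROW_NUMBER() OVER (ORDER BY c1, c2) as rownum, c1, c2
from A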

Wednesday, December 29, 2010

EXECUTE AS and REVERT:

EXECUTE AS and REVERT:
EXECUTE AS and REVERT are used to switch the execution context of a session. The example below changes the context and then switches back to the original login.


The client connects using the “Test” login, and the EXECUTE AS statement switches the execution context to the “Demo” login; the second result set shows that.

The REVERT command switches the execution context back to the Test user.
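
The full sequence looks roughly like the sketch below (a hedged example; the Test and Demo logins are assumed to already exist on the instance):

select SUSER_NAME() as login_name, USER_NAME() as user_name  -- connected as Test

EXECUTE AS LOGIN = 'Demo'
select SUSER_NAME() as login_name, USER_NAME() as user_name  -- second result set: Demo

REVERT
select SUSER_NAME() as login_name, USER_NAME() as user_name  -- back to Test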

Explanation about the keywords:
EXECUTE AS Clause - helps us to define the execution context.

REVERT - Switches the execution context back to the caller of the last EXECUTE AS statement

SUSER_NAME() - Returns the login identification name of the user.

USER_NAME() - Returns a database user name.

Wednesday, December 8, 2010

Error message is: The directory name is invalid in SQL Server

Error message in SQL Server:
An error occurred while executing batch. Error message is: The directory name is invalid.


Reason:
This error typically occurs when disk space is very low or when the directory is protected.

Solution: check your disk space, or try connecting to the SQL Server instance from another machine.


Sunday, November 7, 2010

Careful with SQL server Update

Careful with SQL server Update:
This post shares my experience with the UPDATE statement in SQL Server. We need to be careful when writing an UPDATE statement in scenarios like the one below.

Initial setup:

create table A
( c1 int,
c2 int
)

create table B
( c1 int,
c2 int
)

insert into A values (101,0)
insert into A values (102,0)
insert into A values (103,0)

insert into B values (101,7)
insert into B values (102,5)
insert into B values (103,4)
insert into B values (103,6)


Tables with values:

select * from A

c1  c2
--- ---
101 0
102 0
103 0

select * from B

c1  c2
--- ---
101 7
102 5
103 4
103 6

Every row of column c2 in table A is zero. We need to update column c2 in table A from table B, but table B has multiple rows for the value 103.


Given this data, one would expect the update statement to fail. We mostly use the inner-join style of update in SQL Server, like the statement below.

begin tran
update A
set A.c2=B.c2
from A,B
where A.c1=B.c1

The above statement updates the records without any error: when multiple rows in B match a row in A (as for 103), SQL Server silently picks one of them, so the join-style update masks the duplicate problem.

The method below, in contrast, fails with a "Subquery returned more than 1 value" error because of the multiple matching records.

begin tran
update A set A.c2
=(select B.c2 from B
where A.c1=B.c1
)

We need to be careful when we write update statements.
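
A hedged safety check before running such an update: look for join keys in B that match more than one row.

select c1, count(*) as match_count
from B
group by c1
having count(*) > 1

Any row returned (103 here) means the join-style update would be ambiguous for that key.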

Sunday, October 3, 2010

Character of DUAL


Character of DUAL:
DUAL is owned by SYS. SYS owns the data dictionary, therefore DUAL is part of the data dictionary. You are not to modify the data dictionary via SQL ever.
Dual table and its purpose:
Dual is just a convenience table. You don't need to use it; you can use anything you want. The advantage of dual is that the optimizer understands dual is a special one-row, one-column table; when you use it in queries, it uses this knowledge when developing the plan.
The Dual table structure, field and record details are below.
SQL> desc dual;
 Name                Null?    Type
 ------------------- -------- --------------
 DUMMY                        VARCHAR2(1)

SQL> select * from dual;

D
-
X

SQL> select count(*) from dual;

  COUNT(*)
----------
         1

DUAL is owned by SYS:

The script below connects to the database as the SYSTEM user and tries to modify the DUAL table. The insert fails because the table is owned by the SYS user.

SQL> conn system/password@oraprc
Connected.

SQL> INSERT INTO DUAL VALUES ('X');
INSERT INTO DUAL VALUES ('X')
*
ERROR at line 1:
ORA-01031: insufficient privileges

We change the connection to the SYS user, try the same statement, and this time it is able to insert the data.
SQL> conn sys/password@oraprc as sysdba
Connected.

SQL> INSERT INTO DUAL VALUES ('X');

1 row created.

SQL> select count(*) from dual;
COUNT(*)
----------

2
Special one row:
The definition says DUAL is a special one-row, one-column table, yet we were able to insert records into it, and the count changed accordingly.

SQL> INSERT INTO DUAL VALUES ('X');
1 row created.

SQL> select count(*) from dual;
COUNT(*)
---------

3

SQL> select * from dual;
D
-
X

The reason is that the optimizer understands DUAL is a magic, special one-row table. SELECT * stops after the first row because there is supposed to be exactly one row in there; it's just the way it works.

Delete operation on DUAL:
SQL> select count(*) from dual;
COUNT(*)
----------

3
SQL> delete from dual;
1 row deleted.


SQL> select count(*) from dual;
COUNT(*)
----------

2

SQL> delete from dual;
1 row deleted.

SQL> select count(*) from dual;
COUNT(*)
----------

1
Each delete statement deletes a single record at a time, again because DUAL is treated as a special one-row table.

Checking with a FUNCTION:

We create a function that returns the value 1 and select it from both DUAL and a user-created table.

SQL> select count(*) from dual;

COUNT(*)
----------
2


SQL> create or replace function foo return number
2 as
3 x number;
4 begin
5 x:=1;
6 return 1;
7 end;
8 /
Function created.

SQL> create table emp
2 ( num int);

Table created.

SQL> insert into emp values(1);

1 row created.

SQL> insert into emp values(2);

1 row created.

SQL> insert into emp values(3);

1 row created.

SQL> insert into emp values(4);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from emp;

NUM
----------
1
2
3
4
SQL> select foo from dual;

FOO
----------
1
SQL> select foo from emp;
FOO
----------
1
1
1
1

The EMP table has 4 records, and selecting the function from it returns 4 rows. We checked the same thing against DUAL: its count is 2, yet it returns only one row, again because the optimizer treats DUAL as a one-row table.

Friday, September 24, 2010

Autotrace from SYS - Some things appear to work but don't really

Autotrace from SYS - Some things appear to work but don't really

We use autotrace to get the execution plan and statistics. From the SYS user it appears to work, but it doesn't really. Let's see that.

SQL> create table t ( num number(2), name varchar2(10));

Table created.



SQL> insert into t values(1,'A');

1 row created.

SQL> insert into t values(2,'A');

1 row created.

SQL> select * from t;

NUM NAME
---------- ----------
1 A
2 A



SQL> set autotrace on;
SQL> select * from t;

NUM NAME
---------- ----------
1 A
2 A
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'T'




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
463 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

Now we try the same thing as the SYS user.

SQL> conn sys/password@oraprc as sysdba
Connected.
SQL> set autotrace on;
SQL> select * from system.t;

NUM NAME
---------- ----------
1 A
2 A


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'T'




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
0 bytes sent via SQL*Net to client
0 bytes received via SQL*Net from client
0 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

Notice the statistics this time: 0 consistent gets and 0 bytes sent for a query that returned 2 rows, which is impossible; the numbers reported under SYS are not real. SYSDBA, SYSOPER, "internal" and SYS in general shouldn't be used for anything other than administration.


Saturday, September 4, 2010

CHAR Vs VARCHAR



CHAR Vs VARCHAR:


This post looks at the difference between CHAR and VARCHAR2.

Create table t ( X varchar2(30) , Y char(30));
Insert into t values ('a','a');

The table above has two columns, X and Y, with data types VARCHAR2 and CHAR respectively. A CHAR is nothing more than a VARCHAR2 that is blank-padded out to its maximum length; that is the difference between columns X and Y.

The column X consumes 3 bytes (NULL indicator, length byte, 1 byte for 'a').

The column Y consumes 32 bytes (NULL indicator, length byte, 30 bytes for 'a' padded with 29 trailing blanks).

Consider the points below when using the CHAR data type:

1. Because a CHAR type always blank-pads the resulting string out to a fixed width, it consumes maximum storage, both in the table segment and in any index segments.

2. Another important reason to avoid CHAR types: they create confusion in applications that need to retrieve this information (many cannot “find” their data after storing it). The reason relates to the rules of character string comparison and the strictness with which they are applied. The script below proves that.

SQL@ORA9i> create table t
2 ( char_column char(20),
3 varchar2_column varchar2(20)
4 );

Table created.

SQL@ORA9i> insert into t values ( 'Hello World', 'Hello World' );

1 row created.

SQL@ORA9i> select * from t;

CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World

SQL@ORA9i> select * from t where char_column = 'Hello World';

CHAR_COLUMN VARCHAR2_COLUMN
------------ --------------------
Hello World Hello World

SQL@ORA9i> select * from t where varchar2_column = 'Hello World';

CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World


The above results look identical, but in fact some implicit conversion has taken place: the CHAR(11) literal ‘Hello World’ was promoted to a CHAR(20) and blank-padded when compared to the CHAR column. ‘Hello World         ’ (with trailing spaces) is not the same as ‘Hello World’ without them. We can confirm that these two strings are different.

SQL@ORA9i> select * from t where char_column = varchar2_column;
no rows selected


They are not equal to each other. We would have to either blank pad out the VARCHAR2_COLUMN to be 20 bytes in length or trim the trailing blanks from the CHAR_COLUMN, as follows:

SQL@ORA9i> select * from t where trim(char_column) = varchar2_column;

CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World

SQL@ORA9i> select * from t where char_column = rpad( varchar2_column, 20 );

CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
The problem arises in applications that bind variable-length strings as inputs; the result is often “no data found”:


SQL@ORA9i> variable varchar2_bv varchar2(20)
SQL@ORA9i> exec :varchar2_bv := 'Hello World';

PL/SQL procedure successfully completed.

SQL@ORA9i> select * from t where char_column = :varchar2_bv;

no rows selected

SQL@ORA9i> select * from t where varchar2_column = :varchar2_bv;

CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
The search against the VARCHAR2 column worked, but the one against the CHAR column did not: a VARCHAR2 bind variable is not promoted to a CHAR(20) the way a character string literal is. At this point, many programmers conclude that “bind variables don’t work; we have to use literals.” That would be a very bad decision indeed.

The solution is to bind using a CHAR type:

SQL@ORA9i> variable char_bv char(20)
SQL@ORA9i> exec :char_bv := 'Hello World';

PL/SQL procedure successfully completed.

SQL@ORA9i> select * from t where char_column = :char_bv;

CHAR_COLUMN VARCHAR2_COLUMN
-------------------- -------------------
Hello World Hello World

SQL@ORA9i> select * from t where varchar2_column = :char_bv;

no rows selected

We will be running into this issue constantly if we mix and match CHAR and VARCHAR.

Monday, August 16, 2010

Performance Improvement – Index scan with NULL condition

Performance Improvement – Index scan with NULL condition

What will be our first step when we start tuning the queries?


The first step is creating the index. But simply creating the index does not improve the performance in all scenarios.

The scripts below prove that.


SQL> create table t
2 as
3 select object_name name, a.*
4 from all_objects a;

Table created.

SQL> alter table t modify name null;

Table altered.

SQL> create index t_idx on t(name);

Index created.

SQL> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

SQL> variable b1 varchar2(30)
SQL> exec :b1 := 'T'

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL> delete from t
2 where name = :b1 or name is null;

2 rows deleted.


Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=49 Card=2 Bytes=48)
1 0 DELETE OF 'T'
2 1 TABLE ACCESS (FULL) OF 'T' (Cost=49 Card=2 Bytes=48)




The NAME column has an index, yet the plan still shows a FULL table scan; the index (t_idx) is not used.

The reason: NAME is nullable, the index is on NAME alone, and entirely null keys are NOT entered into B*Tree indexes, so the index cannot be used to find the NULL rows.

SQL> rollback;

Rollback complete.

SQL> drop index t_idx;

Index dropped.

SQL> create index t_idx on t(name,0);

Index created.

SQL> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

SQL> delete from t
2 where name = :b1 or name is null
3 ;

2 rows deleted.


Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=26 Card=2 Bytes=48)
1 0 DELETE OF 'T'
2 1 INDEX (FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=26 Card=
2 Bytes=48)




This version uses an index scan and avoids the table scan: with the constant 0 as a second index column, no index key is ever entirely null, so rows with a NULL name are present in the index.

Wednesday, August 11, 2010

Output clause

Output clause:

The OUTPUT clause has access to the inserted and deleted tables, just like triggers. It can be used to return values to the client, and it works with INSERT, UPDATE, and DELETE to identify the actual rows affected by those statements.


Let us understand OUTPUT clause using example.

Initial setup:


create table #t
(id int,
p_name char(10)
)

insert into #t values ( 1,'AAAAA')
insert into #t values ( 2,'BBBBB')
insert into #t values ( 3,'CCCCC')
insert into #t values ( 4,'DDDDD')
insert into #t values ( 5,'EEEEE')

select * from #t




Scenario 1:

Archive records into the #t_arc table when they are deleted from the #t table.

The normal solution is to insert into #t_arc before deleting the records from #t, based on the same condition.

create table #t_arc
(id int,
p_name char(10)
)

insert into #t_arc
select * from #t
where id =3

delete from #t
where id =3

The OUTPUT clause collapses those two statements into one.

delete from #t
output deleted.* into #t_arc
where id =3



One big advantage is eliminating a second access to the table: the DELETE ... OUTPUT statement reads the data once and performs both operations.

Scenario 2:
Another example, this time capturing audit data while inserting into the table.

create table #t_audit
(id int,
load_time datetime
)


The statement below populates the audit table as it loads data into the #t table.

insert into #t
output inserted.id,getdate() into #t_audit
values ( 6,'FFFFF')
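
OUTPUT also works with UPDATE, where the deleted and inserted tables expose the before and after images of each row; a minimal sketch:

update #t
set p_name = 'ZZZZZ'
output deleted.id, deleted.p_name as old_name, inserted.p_name as new_name
where id = 1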


Thursday, August 5, 2010

Performance Improvement - Part 3: Correlated Update tuning

Performance Improvement - Part 3: Correlated Update tuning:

We saw a small difference in correlated update execution in a previous post; see the link below to review it.


http://karthikeyanbaskaran.blogspot.com/2010/06/difference-in-correlated-update.html

The scenario: the “Address” field in table “Main” needs to be updated from the “Sub” table using the “id” field, and rows with no match get -1. The update method below solves it in a single statement.

update main m
set address = nvl((select address from sub s where m.id=s.id),-1);

A correlated update processes every row in the table “Main”. Let's see how it behaves when “Main” holds a large number of records, and how to improve its performance.


SQL> insert into main
2 select * from main;
............
.......
....
SQL> insert into main
2 select * from main;

Now the table “Main” has 1,048,576 records and the table “Sub” has 3 records.

SQL> select count(*) from main;

COUNT(*)
---------

1048576

SQL> select count(*) from sub;

COUNT(*)
----------

3
Every row now has the same address value, and we can start the analysis.

SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
NNNNNNN 1048576

The update statement below works correctly for our scenario, but it touches all 1,048,576 rows every time.

It updates all 1,048,576 records even when only a few rows actually differ between “Main” and “Sub”.

SQL> set timing on
SQL> set autotrace traceonly
SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1);

1048576 rows updated.

Elapsed: 00:01:44.04
Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 TABLE ACCESS (FULL) OF 'MAIN'
3 1 TABLE ACCESS (FULL) OF 'SUB'

Statistics
----------------------------------------------------------
805 recursive calls
1073721 db block gets
2659 consistent gets
2124 physical reads
252412780 redo size
618 bytes sent via SQL*Net to client
582 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
1048576 rows processed


The statement took 1 minute 44 seconds to complete and generated 252,412,780 bytes of redo. We roll back the change.


SQL> rollback;

Rollback complete.

Elapsed: 00:02:26.09

Performance improved script:

A WHERE condition has been added to the update statement so that it updates only the rows that actually differ between “Main” and “Sub”.

SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1)
3 where address is null
4 or address <> nvl((select address from sub s where m.id=s.id),-1);

1048576 rows updated.

Elapsed: 00:01:42.05

Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'MAIN'
4 2 TABLE ACCESS (FULL) OF 'SUB'
5 1 TABLE ACCESS (FULL) OF 'SUB'

Statistics
----------------------------------------------------------
536 recursive calls
1073630 db block gets
2611 consistent gets
1963 physical reads
252411468 redo size
627 bytes sent via SQL*Net to client
672 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1048576 rows processed


The statement took 1 minute 42 seconds to complete and generated 252,411,468 bytes of redo. There is no big difference when every record has to be updated.

SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
-1 524288
A 262144
C 262144

Elapsed: 00:00:01.03
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00

We change a subset of the records to check the execution time.

SQL> update sub set address='M' where id =4;

1 row updated.

Elapsed: 00:00:00.02
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL> set autotrace traceonly
SQL>
SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1);
1048576 rows updated.

Elapsed: 00:01:51.00

Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 TABLE ACCESS (FULL) OF 'MAIN'
3 1 TABLE ACCESS (FULL) OF 'SUB'




Statistics
----------------------------------------------------------
624 recursive calls
1072662 db block gets
4953 consistent gets
2313 physical reads
248380252 redo size
629 bytes sent via SQL*Net to client
584 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1048576 rows processed



The statement took 1 minute 51 seconds to complete and generated 248,380,252 bytes of redo.


SQL> set autotrace off;
SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
-1 524288
A 262144
M 262144

Elapsed: 00:00:00.09

Rollback the result.

SQL> rollback;

Rollback complete.

Elapsed: 00:04:40.08
SQL> select address,count(*) from main
2 group by address;

ADDRESS COUNT(*)
---------- ----------
-1 524288
A 262144
C 262144


Performance improved script:

SQL> set autotrace traceonly

SQL> update main m
2 set address = nvl((select address from sub s where m.id=s.id),-1)
3 where address is null
4 or address <> nvl((select address from sub s where m.id=s.id),-1);

262144 rows updated.

Elapsed: 00:00:45.04

Execution Plan
----------------------------------------------------------
0 UPDATE STATEMENT Optimizer=CHOOSE
1 0 UPDATE OF 'MAIN'
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'MAIN'
4 2 TABLE ACCESS (FULL) OF 'SUB'
5 1 TABLE ACCESS (FULL) OF 'SUB'




Statistics
----------------------------------------------------------
176 recursive calls
268198 db block gets
2421 consistent gets
2054 physical reads
62104336 redo size
628 bytes sent via SQL*Net to client
672 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
262144 rows processed


The statement took 45 seconds to complete and generated 62,104,336 bytes of redo.

SQL> set autotrace off;
SQL> select 248380252-62104336 redo_diff from dual;

REDO_DIFF
----------
186275916

The second update statement avoids 186,275,916 bytes of redo, saving space in UNDO and in the archive logs.

This method gives good performance when only part of the table actually needs updating.


Monday, July 19, 2010

Performance Improvement - Part 2: Hard parse is not bad

Performance Improvement - Part 2: Hard parse is not bad

In the previous post, we saw that a soft parse does less work than a hard parse.

Please have a look at the post below before reading this one.

http://karthikeyanbaskaran.blogspot.com/2010/07/performance-improvement-part-1-parsing.html

A hard parse includes parsing, optimizing, and generating the plan for the query, while a soft parse skips the optimize and plan-generation steps. So we concluded that hard parsing is bad for application performance.

Sometimes, though, soft parsing causes a performance problem of its own. The scripts below prove that.
SQL@ORA10G> create table t
2 as
3 select case when rownum = 1
4 then 1 else 99 end id, a.*
5 from all_objects a;

Table created.

SQL@ORA10G> create index t_idx on t(id);

Index created.


SQL@ORA10G> analyze table t compute statistics;

Table analyzed.



The WHERE ID=1 condition will return one record, and WHERE ID=99 will return all of the rest (about 40,000 records). The optimizer is well aware of this fact, and we can see different plans for different inputs because we are passing literals in the WHERE clause.


SQL@ORA10G> set autotrace traceonly explain
SQL@ORA10G> select * from t where id = 1;

Execution Plan
----------------------------------------------------------


0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=141)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=1 Bytes=141)
2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=1)

SQL@ORA10G> select * from t where id = 99;

Execution Plan
----------------------------------------------------------


0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=137 Card=34323 Bytes=4839543)
1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=137 Card=34323 Bytes=4839543)






From Oracle9i Database Release 1 through Oracle Database 10g Release 2, Oracle Database will wait until the cursor is opened to do the actual optimization of the query—it will wait for the bind variable value to be supplied by the application before figuring out the right way to optimize the query. This is called bind variable peeking, when the optimizer first looks at the bind values and then optimizes the query.



In this case, however, depending on which inputs are used to first run the query, the database will either choose a full scan or an index range scan plus table access by index rowid. And in Oracle9i Database Release 1 through Oracle Database 10g Release 2, that is the plan that will be used to execute the SELECT * FROM t WHERE ID = :ID query, regardless of the subsequent bind values, until the query is hard-parsed and optimized again.

SQL@ORA10G> variable id number
SQL@ORA10G> exec :id := 99

PL/SQL procedure successfully completed.

SQL@ORA10G> select * from t where id = :id;

Execution Plan
----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=137 Card=20391 Bytes=1814799)

1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=137 Card=20391 Bytes=1814799)





So we started off with ID=99 as the bind, and the optimizer chose a full scan. Therefore, regardless of the bind value, the database will execute a full scan from now on. For example:
SQL@ORA10G> exec :id :=1

PL/SQL procedure successfully completed.

SQL@ORA10G> select * from t where id = :id;

Execution Plan
----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=137 Card=20391 Bytes=1814799)

1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=137 Card=20391 Bytes=1814799)




The above result shows that the query no longer uses the index range scan plus table access by index rowid. The logical I/Os (consistent gets) are higher because of the FULL table scan instead of the index scan.

The result is the classic “poorly performing query” caused by bind variable peeking.

Tuesday, July 13, 2010

Performance Improvement - Part 1: Parsing

Performance Improvement - Part 1: Parsing

In this article, we look at how to improve the performance of SQL queries. Before that, we should understand parsing and its types.

Parsing:

This is the first step in the processing of any statement in Oracle. Parsing is the process of breaking the submitted statement down into its component parts, determining what type of statement it is (query, DML, or DDL), and performing various checks on it.
The parsing process performs two main functions:

· Syntax Check
· Semantic Analysis

Syntax Check:

The syntax check performs the below operation.

Is the statement a valid one?
Does it make sense given the SQL grammar documented in the SQL Reference Manual? Does it follow all of the rules for SQL?

Semantic Analysis:
Semantic analysis goes beyond syntax and answers the questions below.

Is the statement valid in light of the objects in the database (do the tables and columns referenced exist)?
Do you have access to the objects?
Are the proper privileges in place?
Are there ambiguities in the statement?
Soft parsing and hard parsing:
The next step in the parse operation is to see whether the statement we are currently parsing has already been processed by some other session. If it has, we may be in luck: we can skip the next two steps in the process, namely

1. Optimization
2. Row source generation.

If we can skip these next two steps in the process, we have done what is known as a Soft Parse.


If we cannot, and must do all of the steps, we are performing what is known as a Hard Parse. A hard parse includes parsing, optimizing, and generating the plan for the query.
Example for Hard Parse:
The below scripts demonstrate the hard parsing.

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

System altered.

SQL> create table t
(
id number(2),
Des varchar2(20)
);

Table created.

insert into t values(1,'AA');
insert into t values(2,'BB');
insert into t values(3,'CC');
insert into t values(4,'DD');
insert into t values(5,'EE');
insert into t values(6,'FF');


SQL> select * from t;

ID DES
--- --------
1 AA
2 BB
3 CC
4 DD
5 EE
6 FF

6 rows selected.


SQL> select SQl_TEXT,executions,hash_value,child_latch,child_address from v$sql where sql_text like 'select * from t where id %';

no rows selected

SQL> select * from t where id = 2;

ID DES
--- --------
2 BB

SQL> select * from t where id = 2;

ID DES
--- --------
2 BB

SQL> select * from t where id = 3;

ID DES
--- --------
3 CC

SQL> select SQl_TEXT,executions,hash_value,child_latch,child_address from v$sql where sql_text like 'select * from t where id %';





The result shows a different hash value for “where id = 2” and “where id = 3”: the statement is hard-parsed each time the condition changes.

Example for Soft parse:

The executions column shows 2 for “where id = 2”: the first run was a hard parse, and the second was a soft parse.

How do we get soft parses for both “where id = 2” and “where id = 3”? We use a bind variable, and that solves the issue.



SQL> variable id number
SQL> exec :id := 1;

PL/SQL procedure successfully completed.

SQL> select * from t where id = :id;

ID DES
--- --------
1 AA

SQL> exec :id :=2 ;

PL/SQL procedure successfully completed.

SQL> select * from t where id = :id;

ID DES
--- ------
2 BB

SQL> exec :id :=2 ;

PL/SQL procedure successfully completed.

SQL> select * from t where id = :id;

ID DES
--- -------
2 BB


SQL> select SQl_TEXT,executions,hash_value,child_latch,child_address from v$sql where sql_text like 'select * from t where id %';


With the bind variable, a separate cursor is no longer created each time: a single hash value exists for the statement “where id = :id”, and only the value of the variable (:id) changes. The executions field shows 3 across the different values.
Hard parsing is very CPU-intensive. If we want high performance from our applications, we want our queries to be soft-parsed (to skip the optimize/generate phases).

If we have to Hard Parse a large percentage of our queries, our system will function slowly and in some cases not at all.

Now we have strong proof that soft parsing improves performance. But soft parsing is not always good, and hard parsing is not always bad; we see that in the next post.

Monday, July 5, 2010

Reusable scripts - Part1

Find the occurrences of a letter in a string:

The query below counts the occurrences of the letter 'a' in a string.

select length('Find a letter in a string')
       - length(replace('Find a letter in a string', 'a', ''))
       as "Number of occurrence"
from dual;

Number of occurrence
--------------------
2


Remove the alphabetic letters in the given string:

declare
  len   number;                         -- string length, reused to hold ASCII codes
  i     number;
  p_str varchar2(1000) := 'aa123yksjds45';
begin
  i   := 1;
  len := length(p_str);
  while i <= len
  loop
    len := ascii(substr(p_str, i, 1));  -- ASCII code of the current character
    if len > 47 and len < 58 then       -- digit (0-9): keep it and move on
      i := i + 1;
    else                                -- not a digit: strip every occurrence of it
      p_str := replace(p_str, substr(p_str, i, 1), '');
    end if;
  end loop;
  dbms_output.put_line(p_str);
end;
/

Result:

12345
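
On Oracle 10g and later, regexp_replace does the same thing in a single query; a hedged alternative:

select regexp_replace('aa123yksjds45', '[^0-9]', '') as digits_only from dual;

It returns the same 12345.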

Divide the given comma separated words:

declare
  v_comma_position number;
  v_len            number;
  p_comma_val      varchar2(1000) := 'value1,value2,value3,value4,value5';
  p_comma_tmp      varchar2(1000);
  v_result         varchar2(1000);
begin
  p_comma_val := p_comma_val || ',';   -- append a trailing comma as a sentinel
  v_len := length(p_comma_val);
  while v_len >= 0
  loop
    v_comma_position := instr(p_comma_val, ',') + 1;
    p_comma_tmp := p_comma_val;
    p_comma_val := '';
    v_result    := substr(p_comma_tmp, 1, v_comma_position - 2);  -- word before the comma
    p_comma_val := substr(p_comma_tmp, v_comma_position, v_len);  -- rest of the string
    v_len := v_len - v_comma_position;
    dbms_output.put_line(v_result);
  end loop;
end;
/

Result:

value1
value2
value3
value4
value5
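
On Oracle 11g and later (regexp_count is an 11g function), the split can be done in plain SQL; a hedged sketch:

select regexp_substr('value1,value2,value3,value4,value5', '[^,]+', 1, level) as val
from dual
connect by level <= regexp_count('value1,value2,value3,value4,value5', ',') + 1;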


Monday, June 28, 2010

creating the composite index with more than 900 bytes



Creating the composite index with more than 900 bytes:
SQL Server has a constraint when creating indexes: an index entry is not allowed if the key data is more than 900 bytes, and inserting such data raises an error.

There is a way to get more than 900 bytes into a composite index; we see that in this article.

Initial setup:

Create a table with two columns whose combined size can exceed 900 bytes.

use tempdb
create table tmp
(
c1 varchar(10),
c2 varchar(1000)
)


Index creation:

The CREATE INDEX statement gives a warning because the maximum possible key size is more than 900 bytes.

create index idx_tmp on tmp(c1,c2)

Warning! The maximum key length is 900 bytes. The index 'idx_tmp' has maximum length of 1010 bytes. For some combination of large values, the insert/update operation will fail.

Test data Length:
select len('

test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
')





Insert statement with more than 900 bytes:

The total size of the inserted key is 1010 bytes here, so the insert is not allowed and an error is raised.


insert into tmp values( 'hai',
'test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data'
)

INCLUDE clause:

There is another way to avoid this error and still cover both columns with an index.

drop index idx_tmp on tmp

By using the INCLUDE clause of the CREATE INDEX statement, the index key could be defined as (c1) and c2 defined as a nonkey column. In this way, the index key size would be 10 bytes.


CREATE INDEX idx_tmp2
ON tmp (c1)
INCLUDE (c2);

Insert statement with more than 900 bytes:

The total size of the insert is again 1010 bytes, but this time it succeeds because only c1 is part of the index key.

insert into tmp values( 'hai',
'test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data
test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data test_data'
)



The index includes both columns; the query below proves that.

select b.name as index_name, c.name as column_name, ic.is_included_column
from sys.indexes b
inner join sys.index_columns ic
  on b.object_id = ic.object_id and b.index_id = ic.index_id
inner join sys.columns c
  on ic.object_id = c.object_id and ic.column_id = c.column_id
where b.name = 'idx_tmp2'





In addition, we can verify the execution plan: the queries below use an index scan.
select * from tmp where c1='10'


select * from tmp where c2='10'



Tuesday, June 22, 2010

Good to Know in SQL server 2008

Good to Know in SQL server 2008

1. Insert values:

We often insert data with one “INSERT INTO tab_name VALUES (….)” per record. SQL Server 2008 offers a simple way to avoid multiple INSERT statements: several records can be inserted in a single statement.


use tempdb

Create table Dept(Id int, DName varchar(100))

Insert into Dept
values
(1,'A'),
(2,'B'),
(3,'C'),
(4,'D')

select * from Dept




2. Declare and initialize on the same line:

Old style ( SQL 2000 and 2005) :

declare @i int
set @i = 10
declare @dt datetime
set @dt = GETDATE()
select @i as Col1, @dt as Col2




New Style (SQL server2008):

declare @i int = 10
declare @dt datetime = getdate()
select @i as Col1, @dt col2
go

Monday, June 21, 2010

Pivot in SQL server

PIVOT:

PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output.

CREATE TABLE Sales (Make varchar(8), Year int, Sales int)
GO
Insert into Sales values ('Honda',2005,20000)
Insert into Sales values ('Honda',2005,60000)
Insert into Sales values ('Mahindra',2005,40000)
Insert into Sales values ('Honda',2006,30000)
Insert into Sales values ('Mahindra',2006,30000)
Insert into Sales values ('Mahindra',2006,60000)
Insert into Sales values ('Mahindra',2007,8000)
GO

First the plain GROUP BY: get the number of cars sold for each make in each year.
Select Make,YEAR,SUM(Sales) from Sales
group by Make,Year

The transposed (pivoted) version of the above query:

Select * from Sales pivot(Sum(sales) for Year in ([2001],[2002],[2003],[2004],[2005],[2006],[2007])) t
go


Restricting the output to just the columns we need to show:

Select * from Sales pivot(Sum(sales) for Year in ([2005],[2006],[2007])) t



List the sum for each make per year, keeping only makes that sold in all the years and not just one:

Select * from Sales pivot(Sum(sales) for Year in ([2005],[2006],[2007])) t
where [2007] is not null
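
Because SELECT * pivots whatever non-aggregated columns remain in the table, it can be safer to list the grouping columns explicitly through a derived table; a hedged equivalent of the restricted query:

Select Make, [2005], [2006], [2007]
from (select Make, Year, Sales from Sales) s
pivot(Sum(Sales) for Year in ([2005],[2006],[2007])) t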


Thursday, June 10, 2010

Difference in Correlated Update

Difference in Correlated Update :



I came across a correlated update statement in the format below. It updates the address field in the “Main” table, taking the value from the “Sub” table. The NVL function is applied outside the subquery.

Outer NVL Query:
update main m
set address = nvl((select address from sub s where m.id=s.id),-1);

We could instead apply the NVL function inside the subquery (directly on the column). So what is the difference between the “Outer NVL Query” and the “Inner NVL Query”?
Inner NVL Query:
update main m
set address = (select nvl(address,-1) from sub s where m.id=s.id)

Initial setup:
This step creates the tables and populates it.

create table main ( id number, Address varchar2(10));

insert into main values (1,'NNNNNNN');
insert into main values (2,'NNNNNNN');
insert into main values (3,'NNNNNNN');
insert into main values (4,'NNNNNNN');

create table sub ( id number, Address varchar2(10));

insert into sub values (1,'A');
insert into sub values (2,null);
insert into sub values (4,'C');

select * from main;

        ID ADDRESS
---------- ----------
         1 NNNNNNN
         2 NNNNNNN
         3 NNNNNNN
         4 NNNNNNN

select * from sub;

        ID ADDRESS
---------- ----------
         1 A
         2
         4 C

Inner NVL Query:

update main m
set address = (select nvl(address,-1) from sub s where m.id=s.id);

The NVL is applied when a NULL value comes from the SUB table (id 2 becomes -1). But when there is no match at all in SUB (id 3), the subquery itself returns NULL, so a NULL is written into Main.

One might think the outer NVL query works the same way, but there is a difference.
Outer NVL Query:
update main m
set address = nvl((select address from sub s where m.id=s.id),-1)

We can see the difference in the result below: the outer NVL applies in both situations, so id 2 (NULL address in Sub) and id 3 (no match in Sub) both end up as -1.

This is the small difference in the correlated update.

Tuesday, June 8, 2010

Shell script to find the latest file

The shell script below finds the latest file whose name starts with “sales”.

For example, these files are available on the UNIX box:
sales_2007_02_23_0010.dat
sales_2007_02_24_0024.dat
sales_2007_02_24_0714.dat
sales_2007_02_25_0010.dat
sales_2007_02_25_0525.dat
sales_2007_02_25_0835.dat

There are three files dated the 25th with different timestamps. The aim is to identify the latest one, so the script picks the file with the most recent creation time on the UNIX box.

Script : latest_file.sh

#!/usr/bin/ksh
rm -f /home/hb/ms/hb/karthi/sales.dat
val=`ls -lrt sales* | tail -1 | awk '{print $9}'`
echo $val
ls /home/hb/ms/hb/karthi/$val
cp /home/hb/ms/hb/karthi/$val /home/hb/ms/hb/karthi/sales.dat

Monday, June 7, 2010

Cumulative Sum

Ways to get a Cumulative Sum:

Initial setup:
Creating the table and populating the table.

CREATE TABLE #Sales (DayCount smallint, Sales money, RunningTotal money)


INSERT INTO #Sales (DayCount, Sales, RunningTotal)
SELECT 1, 10, 0
union all
SELECT 2, 7, 0
union all
SELECT 3, 2, 0
union all
SELECT 4, 6, 0

GO

SELECT * FROM #Sales ORDER BY DayCount

GO





Method 1:
DECLARE @RunningTotal money
SET @RunningTotal = 0

-- "quirky update": assigns the variable and the column in one pass over the table;
-- note SQL Server does not guarantee the row order here, so treat this as a trick
UPDATE #Sales
SET @RunningTotal = RunningTotal = @RunningTotal + Sales

GO

SELECT * FROM #Sales ORDER BY DayCount

GO




Method 2:

SELECT
    a.DayCount,
    a.Sales,
    SUM(b.Sales) AS 'Running Total'
FROM #Sales a
JOIN #Sales b ON b.DayCount <= a.DayCount
GROUP BY a.DayCount, a.Sales
ORDER BY a.DayCount, a.Sales
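
For reference, later releases (SQL Server 2012 and up) support a windowed SUM that computes the running total directly, with no self-join or update trick:

SELECT DayCount, Sales,
       SUM(Sales) OVER (ORDER BY DayCount
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM #Sales
ORDER BY DayCount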

Wednesday, June 2, 2010

Partitioning in SQL server


Partitioning in SQL server with small sample:



Summary: Table partitioning can make very large tables and indexes easier to manage, and improve the performance of appropriately filtered queries.

Table Partitioning Components:

The partition function, partition scheme, and partitioned table or index form a dependency tree, with the partition function at the top, the partition scheme depending on the partition function, and then the partitioned table or index depending on being linked to a partition scheme.


Partition function:
The partition function defines the boundary values of the initial set of partitions and the data type of the partitioned column:
o It makes no reference to any tables or disk storage.
o It forms the basis for one or more partition schemes.


create PARTITION function PF_dept(int)
AS range right FOR values (10,20,30)

Partition scheme:
The partition scheme maps particular partitions to filegroups:
o A given partition scheme can be used for one or more partitioned tables, indexes, and indexed views.
create partition scheme ps_dept
as partition PF_dept
TO (FG1,FG2,FG3,FG4)

Partitioned table:
The partitioned table or index is tied to a particular partition scheme when it is created:
o The partition table has only an indirect relationship, through the partition scheme, to the partition function.
create table dept
(deptid int,
deptname varchar(10)
)
ON ps_dept(deptid)


CREATE TABLE EMP
(EMPID INT,
EMPNAME VARCHAR(10),
DEPT_ID INT)
ON PS_DEPT(DEPT_ID)
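
A quick way to verify where rows land is the $PARTITION function; a hedged sketch (it assumes the filegroups FG1 through FG4 actually exist in the database):

insert into dept values (5, 'HR')       -- 5 < 10, so partition 1
insert into dept values (25, 'SALES')   -- 20 <= 25 < 30, so partition 3

select deptid, $PARTITION.PF_dept(deptid) as partition_number
from dept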

WITH TIES

WITH TIES in SQL server

Specifies that additional rows be returned from the base result set with the same value in the ORDER BY columns appearing as the last of the TOP n (PERCENT) rows. TOP...WITH TIES can be specified only in SELECT statements, and only if an ORDER BY clause is specified.

create table sales( customer_name varchar(10), sales_amount int)

insert into sales values
('A',6000),
('B',6000),
('C',10000),
('D',12000),
('E',13000),
('F',15000),
('G',5000),
('H',4000)

Top 5:
The “Top 5” gives the first 5 records based on the ORDER BY in the query.

select top 5 * from sales
order by sales_amount desc

customer_name sales_amount
------------- ------------
F 15000
E 13000
D 12000
C 10000
B 6000

(5 row(s) affected)

Top 5 WITH TIES:
The “Top 5 WITH TIES” gives 6 records because rows A and B share the value (6000) that appears as the last of the TOP 5.

select top 5 WITH TIES * from sales
order by sales_amount desc

customer_name sales_amount
------------- ------------
F 15000
E 13000
D 12000
C 10000
A 6000
B 6000

(6 row(s) affected)

Comparing the Contents of Two Tables in SQL Server

Comparing the Contents of Two Tables:

create table a
(
c1 int,
c2 varchar(3),
c3 varchar(3)
)


create table b
(
c1 int,
c2 varchar(3),
c3 varchar(3)
)

insert into a values(1,'x','y')
insert into a values(2,'xx','y')
insert into a values(3,'x','y')

insert into b values(1,'x','y')
insert into b values(2,'x','y')
insert into b values(3,'x','yy')

select * from a

C1 C2 C3
----- -- --
1 x y
2 xx y
3 x y

select * from b
C1 C2 C3
----- -- --
1 x y
2 x y
3 x yy


The first rows are the same, but the second and third rows differ. This is how we can find them:
select i.c1,i.c2, i.c3,
count(i.src1) CNT1,
count(i.src2) CNT2
from
( select a.*,
1 src1,
null as src2
from a
union all
select b.*,
null as src1,
2 src2
from b
)i
group by i.c1,i.c2,i.c3
having count(i.src1) <> count(i.src2)


C1 C2 C3 CNT1 CNT2
--- -- -- ---- ----
2 x y 0 1
2 xx y 1 0
3 x y 1 0
3 x yy 0 1


The below link having the script in Oracle:

http://karthikeyanbaskaran.blogspot.com/2010/02/comparing-contents-of-two-tables.html

Spool operation in SQL server



SQL Server doesn't have a SPOOL command like Oracle does for writing to files, but there are other ways of doing what we want.

1. For reading, use xp_cmdshell and the ECHO command. Use the > or >> redirection symbols to either create or append to a file.

xp_cmdshell "@ECHO test message >> C:\file.fil"
2. Put the information you want into a table (note this can't be an ordinary temporary table, but it can be a global temporary table) and then bcp it out to a file via xp_cmdshell.

xp_cmdshell "bcp .. out c:\file.fil -Usa -P -c"
3. Run the select through ISQL via xp_cmdshell and use the -o parameter to output the results to a file. This example uses the -E parameter to avoid hard-coding a userid.

declare @str varchar(255)
select @str = 'isql -Q"select * from " -E -oc:\file.fil'
exec master..xp_cmdshell @str

Enable xp_cmdshell on a MS SQL Server Instance

Enable xp_cmdshell on a MS SQL Server Instance :

xp_cmdshell has the power to make modifications to the world outside SQL Server, so that power has to be controlled and manageable from a security standpoint. In early versions of Microsoft SQL Server, the xp_cmdshell extended procedure was enabled by default, which created security gaps for SQL Server owners: even when administrators did not use xp_cmdshell themselves, it could be used in an insecure way by a SQL Server developer. Microsoft now lets administrators enable or disable the xp_cmdshell extended procedure, and ships SQL Server with xp_cmdshell disabled by default. If you understand the security risks and can mitigate them, you can enable xp_cmdshell with the sp_configure stored procedure.
The T-SQL below shows how xp_cmdshell can be used to delete a file named delete-me-file.txt in the root folder of the C drive:


xp_cmdshell 'del c:\delete-me-file.txt'

Since the xp_cmdshell extended procedure has not been enabled yet, the SQL Server will return the following error and warning message:


Msg 15281, Level 16, State 1, Procedure xp_cmdshell, Line 1
SQL Server blocked access to procedure 'sys.xp_cmdshell' of component 'xp_cmdshell' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'xp_cmdshell' by using sp_configure. For more information about enabling 'xp_cmdshell', see "Surface Area Configuration" in SQL Server Books Online.
The message is fairly self-explanatory: we can enable the sys.xp_cmdshell procedure either with sp_configure or with the SQL Server Surface Area Configuration tool. Of course, to make configuration changes on the server, you must have system administrator rights or equivalent permissions on the SQL Server instance.

Enable xp_cmdshell using sp_configure:
exec sp_configure
go
exec sp_configure 'xp_cmdshell', 1
-- Configuration option 'xp_cmdshell' changed from 0 to 1. Run the RECONFIGURE statement to install.
go
reconfigure
go
When you run exec sp_configure, you get a list of configuration values for the SQL Server installation; the last row of the list shows xp_cmdshell. The config_value and run_value columns show whether xp_cmdshell is enabled and whether the change has been reflected in the running configuration. The command exec sp_configure 'xp_cmdshell', 1 sets the configured value (config_value) to 1, meaning enabled, and the final reconfigure statement applies the change to the running server configuration.

How to delete a file using xp_cmdshell extended procedure:


After enabling xp_cmdshell, deleting a file from the file system is straightforward, provided the account running xp_cmdshell has the necessary permissions on the file.

xp_cmdshell 'del c:\delete-me-file.txt'

Sysname

About Sysname:

The sysname data type is used for table columns, variables, and stored procedure parameters that store object names. The exact definition of sysname is related to the rules for identifiers. Therefore, it can vary between instances of SQL Server. sysname is functionally the same as nvarchar(128) except that, by default, sysname is NOT NULL. In earlier versions of SQL Server, sysname is defined as varchar(30).

Example:

declare @path sysname
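
A minimal sketch of how it might be used (the object name below is hypothetical):

declare @path sysname
set @path = N'dbo.SomeTable'             -- hypothetical object name
select object_id(@path) as object_id     -- resolves the name; NULL if it does not exist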