Comparing the Contents of Two Tables:
I have two tables named M and N. They have identical columns and have the same number of rows via select count(*) from M and from N. However, the content in one of the rows is different, as shown in the following query:
SQL> select * from M where C1=1;
C1 C2 C3
-- ------------ ----
1 AAAAAAAAAAAA 100
SQL> select * from N where C1=1;
C1 C2 C3
--- ------------ --------
1 AAAAAAAAAAAB 100
The only difference is the last character in column C2. It is an A in table M and a B in table N. I would like to write SQL to compare or see if tables M and N are in sync with respect to their content rather than the number of rows, but I don't know how to do it.
OK, we'll do the specific solution to this problem with columns C1, C2, and C3, and then we'll see how to generalize this to any number of columns. The first and immediate answer we go to was this:
(select 'A', a.* from M
MINUS
select 'A', b.* from N)
UNION ALL
(select 'B', b.* from b
MINUS
select 'B', a.* from a);
That is, just take M minus N (which gives us everything in M that's not in N) and add to that (UNION ALL) the result of N minus M. In fact, that is correct, but it has a couple of drawbacks:
• The query requires four full table scans.
• If a row is duplicated in M, then MINUS will "de-dup" it silently (and do the same with N).
So, this solution would be slow and also hide information from us. There is a better way, however, that uses just two full scans and GROUP BY. Consider these values in A and B:
create table a
(
c1 int,
c2 varchar(3),
c3 varchar(3)
);
create table b
(
c1 int,
c2 varchar(3),
c3 varchar(3)
);
insert into a values(1,'x','y');
insert into a values(2,'xx','y');
insert into a values(3,'x','y');
insert into b values(1,'x','y');
insert into b values(2,'x','y');
insert into b values(3,'x','yy');
select * from a;
C1 C2 C3
---- -- --
1 x y
2 xx y
3 x y
select * from b;
C1 C2 C3
--- -- --
1 x y
2 x y
3 x yy
The first rows are the same, but the second and third rows differ. This is how we can find them:
select c1, c2, c3,
count(src1) CNT1,
count(src2) CNT2
from
( select a.*,
1 src1,
to_number(null) src2
from a
union all
select b.*,
to_number(null) src1,
2 src2
from b
)
group by c1,c2,c3
having count(src1) <> count(src2);C1 C2 C3 CNT1 CNT2
--- -- -- ---- ----
2 x y 0 1
2 xx y 1 0
3 x y 1 0
3 x yy 0 1
Now, because COUNT(
The below having the script in SQL Server:
http://karthikeyanbaskaransqlserver.blogspot.com/2010/02/comparing-contents-of-two-tables.html
Tuesday, February 9, 2010
Comparing the Contents of Two Tables
Subscribe to:
Post Comments (Atom)
0 comments:
Post a Comment