Eliminating duplicate records from a query set which have same ROWID
(March 31, 2007)
you might have records in a table that have the same attributes, but different ROWID, and there are many solution for eliminating such duplicate rows. However, the situation here is a bit different because the row we have in question have the same ROWID!! You might think, how could this be possible, well easy, A UNION operation on the same table as shown below
Please follow the step
I am using the normal EMP table
SQL> SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
UNION
SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP
WHERE SAL > 2500
EMPNO ENAME
SAL RANK
---------- ------------ -------- ----
222 ammar
A
7566 JONES
2975 B
7698 BLAKE
2850 B
7782 CLARK
2450 A
7788 SCOTT
3000 B
7839
KING 5000
A
7839 KING
5000 B
7902 FORD
3000 B
7934 MILLER
1300 A
9 rows selected.
Note that the record 7839 appears twice because each record belongs to a different rank
STEP 2
Here i am just adding an outer select, effectively dealing with the Two unions as a view (in-line view in fact) and ordering the result by RANK
SQL> SELECT * FROM (
2 SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
3 UNION
4 SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500
5 ) ORDER BY RANK;
EMPNO ENAME
SAL RANK
---------- --------- ----------- ----
222 ammar
A
7782 CLARK
2450 A
7839
KING 5000
A
7934 MILLER
1300 A
7566 JONES
2975 B
7902 FORD
3000 B
7839
KING 5000
B
7788 SCOTT
3000 B
7698 BLAKE
2850 B
9 rows selected.
STEP 3
Now, I will add an analytical function called ROW_NUMBER, this is not a group function, rather it has a value for each row. Note that it appears as a column in the SELECT statement, I have given it an Alian (Serial). This function partitions the result of the union by EMPNO, and if the empno repeats, a new serial is given to it
SQL>SELECT EMPNO, ENAME, SAL, RANK , ROW_NUMBER() OVER(PARTITION BY EMPNO ORDER BY EMPNO) SERIAL FROM
( SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
UNION
SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500
)
ORDER BY RANK
EMPNO ENAME
SAL RANK
Serial
---------- --------- ----------- ---- ------
222 ammar
A 1
7782 CLARK
2450 A 1
7839
KING 5000
A 1
7934 MILLER
1300 A 1
7566 JONES
2975 B 1
7902 FORD
3000 B 1
7839
KING 5000
B 2
7788 SCOTT
3000 B 1
7698 BLAKE
2850 B 1
9 rows selected.
NOTE, I JUST MODIFIED THE MAIN SELECT AND ADDED EMPNO, ENAME, SAL, RANK AND
ROW_NUMBER FUNCTION OVER()
STEP 4
NOW ELIMINATE ANY SERIAL WITH VALUE MORE THAN 1 BY ADDING ANOTHER OUTER SELECT
and where condition at the end
SQL>SELECT * FROM
(
SELECT EMPNO, ENAME, SAL, RANK , ROW_NUMBER() OVER(PARTITION BY EMPNO
ORDER BY EMPNO) SERIAL FROM
( SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
UNION
SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500
)
)AA
WHERE AA.SERIAL = 1 ORDER BY RANK
EMPNO
ENAME SAL
RANK Serial
---------- --------- ----------- ---- ------
222 ammar
A 1
7782 CLARK
2450 A 1
7839
KING 5000
A 1
7934 MILLER
1300 A 1
7566 JONES
2975 B 1
7902 FORD
3000 B 1
7788 SCOTT
3000 B 1
7698 BLAKE
2850 B 1
8 rows selected.
The duplicate record disappeared