Eliminating  duplicate records from a query set which have same ROWID

(March 31, 2007)

you might have records in a table that have the same attributes, but different ROWID,  and there are many solution for eliminating such duplicate rows.  However, the situation here is a bit different because the row we have in question have the same ROWID!!  You might think, how could this be possible,  well easy,  A UNION operation on the same table as shown below

Please follow the step

I am using the normal EMP table

SQL>  SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
      UNION
      SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500
 

EMPNO      ENAME         SAL         RANK
---------- ------------ --------     ----
222        ammar                     A
7566       JONES         2975        B
7698       BLAKE         2850        B
7782       CLARK         2450        A
7788       SCOTT         3000        B
7839       KING          5000        A
7839       KING          5000        B

7902       FORD          3000        B
7934       MILLER        1300        A
 

9 rows selected.

Note that the record 7839 appears twice because each record belongs to a different rank



STEP 2
 

Here i am just adding an outer select, effectively dealing with the Two unions as a view (in-line view in fact) and ordering the result by RANK


SQL> SELECT * FROM (
2 SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
3 UNION
4 SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500
5 ) ORDER BY RANK;

EMPNO        ENAME     SAL          RANK
----------   --------- -----------  ----
222          ammar                  A
7782         CLARK         2450     A
7839         KING          5000     A
7934         MILLER        1300     A
7566         JONES         2975     B
7902         FORD          3000     B
7839         KING          5000     B
7788         SCOTT         3000     B
7698         BLAKE         2850     B   

9 rows selected.

STEP 3

Now, I will add an analytical function called ROW_NUMBER, this is not a group function, rather it has a value for each row.  Note that it appears as a column in the SELECT statement, I have given it an Alian (Serial). This function partitions the result of the union by EMPNO, and if the empno repeats, a new serial is given to it

SQL>SELECT EMPNO, ENAME, SAL, RANK , ROW_NUMBER() OVER(PARTITION BY EMPNO     ORDER BY EMPNO) SERIAL FROM

  ( SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
   UNION
   SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500

   )

ORDER BY RANK


EMPNO        ENAME     SAL          RANK    Serial
----------   --------- -----------  ----    ------
222          ammar                  A        1
7782         CLARK         2450     A        1
7839         KING          5000     A        1
7934         MILLER        1300     A        1
7566         JONES         2975     B        1
7902         FORD          3000     B        1
7839         KING          5000     B        2
7788         SCOTT         3000     B        1
7698         BLAKE         2850     B        1

 

9 rows selected.

NOTE, I JUST MODIFIED THE MAIN SELECT AND ADDED EMPNO, ENAME, SAL, RANK AND ROW_NUMBER FUNCTION OVER()

STEP 4

NOW ELIMINATE ANY SERIAL WITH VALUE MORE THAN 1 BY ADDING ANOTHER OUTER SELECT and where condition at the end


SQL>SELECT * FROM

(
SELECT EMPNO, ENAME, SAL, RANK , ROW_NUMBER() OVER(PARTITION BY EMPNO     ORDER BY EMPNO) SERIAL FROM

  ( SELECT EMPNO, ENAME, SAL, 'A' RANK FROM EMP WHERE DEPTNO=10
   UNION
   SELECT EMPNO, ENAME , SAL, 'B' RANK FROM EMP WHERE SAL > 2500

   )  

)AA
WHERE AA.SERIAL = 1 ORDER BY RANK

 

EMPNO        ENAME     SAL          RANK    Serial
----------   --------- -----------  ----    ------
222          ammar                  A        1
7782         CLARK         2450     A        1
7839         KING          5000     A        1
7934         MILLER        1300     A        1
7566         JONES         2975     B        1
7902         FORD          3000     B        1
7788         SCOTT         3000     B        1
7698         BLAKE         2850     B        1 

 

8 rows selected.

 

The duplicate record disappeared