PL/SQL: best way to find and remove duplicate rows ( keep 1 row ) 
Author Message
 PL/SQL: best way to find and remove duplicate rows ( keep 1 row )
Hi all,

I would like a script which would find all the duplicate rows in the table
and keep 1 of the duplicate rows and delete the others.

Any suggestions.

Regards, Paul



Thu, 24 Jan 2002 03:00:00 GMT
 PL/SQL: best way to find and remove duplicate rows ( keep 1 row )

For example

Create table test as select distinct * from your_origin_table;

kalle

Quote:
> Hi all,

> I would like a script which would find all the duplicate rows in the table
> and keep 1 of the duplicate rows and delete the others.

> Any suggestions.

> Regards, Paul



Fri, 25 Jan 2002 03:00:00 GMT
 PL/SQL: best way to find and remove duplicate rows ( keep 1 row )
Declare
    Cursor C1 is
    select distinct key,rowid
     from mytable;
Begin
    for i in C1 loop
        delete mytable
        where key = i.key
             and rowid != i.rowid;
    End loop
End;
Quote:

> Hi all,

> I would like a script which would find all the duplicate rows in the table
> and keep 1 of the duplicate rows and delete the others.

> Any suggestions.

> Regards, Paul



Fri, 25 Jan 2002 03:00:00 GMT
 PL/SQL: best way to find and remove duplicate rows ( keep 1 row )

(if that email address didn't require changing)

Quote:

>Declare
>    Cursor C1 is
>    select distinct key,rowid
>     from mytable;
>Begin
>    for i in C1 loop
>        delete mytable
>        where key = i.key
>             and rowid != i.rowid;
>    End loop
>End;

that deletes ALL duplicates - it does not leave one behind.

consider:

SQL> create table t ( x int );
Table created.

SQL> insert into t values ( 1 );
SQL> insert into t values ( 1 );
SQL> insert into t values ( 2 );
SQL> commit;
Commit complete.

SQL> Declare
  2      Cursor C1 is
  3      select distinct x,rowid
  4       from T;
  5  Begin
  6      for i in C1 loop
  7          delete from t
  8          where x = i.x
  9               and rowid != i.rowid;
 10      End loop;
 11  End;
 12  /

PL/SQL procedure successfully completed.

SQL>
SQL> select * from t;

         X
----------
         2

You don't need plsql to do this, it would just be something like:

SQL> delete from T
  2   where rowid <> ( select max(rowid)
  3                      from T t2
  4                     where t2.x = t.x )
  5  /

1 row deleted.

SQL> select * from t;

         X
----------
         1
         2

that picks a 'random' duplicate (the one with the max rowid) to keep and deletes
the rest of them.

Quote:

>> Hi all,

>> I would like a script which would find all the duplicate rows in the table
>> and keep 1 of the duplicate rows and delete the others.

>> Any suggestions.

>> Regards, Paul

--
See http://govt.us.oracle.com/~tkyte/ for my columns 'Digging-in to Oracle8i'...
Current article is "Part I of V, Autonomous Transactions" updated June 21'st


Oracle Service Industries     Reston, VA   USA

Opinions are mine and do not necessarily reflect those of Oracle Corporation



Fri, 25 Jan 2002 03:00:00 GMT
 PL/SQL: best way to find and remove duplicate rows ( keep 1 row )
Thomas and Ken,

Thanks so much for replying.

I took your fancy correlated subquery concept and applied it to
also show all duplicates in a table first.

create table my_table1
 ( my_table char(30), my_count integer );

-- Remove duplicates keeping one dup with max(rowid)
delete from my_table1 t
   where rowid <> ( select max(rowid)
                      from my_table1 t2
                     where t2.my_table = t.my_table );

-- Find and Display Each duplicate row
select a.*, rowid  from my_table1 a
 where a.my_table = ( select b.my_table  from
    my_table1 b
    where b.my_table = a.my_table
    group by b.my_table
    having count(*) > 1 )
 order by my_table;

Regards, Paul


Quote:

> (if that email address didn't require changing)

> >Declare
> >    Cursor C1 is
> >    select distinct key,rowid
> >     from mytable;
> >Begin
> >    for i in C1 loop
> >        delete mytable
> >        where key = i.key
> >             and rowid != i.rowid;
> >    End loop
> >End;

> that deletes ALL duplicates - it does not leave one behind.

> consider:

> SQL> create table t ( x int );
> Table created.

> SQL> insert into t values ( 1 );
> SQL> insert into t values ( 1 );
> SQL> insert into t values ( 2 );
> SQL> commit;
> Commit complete.

> SQL> Declare
>   2      Cursor C1 is
>   3      select distinct x,rowid
>   4       from T;
>   5  Begin
>   6      for i in C1 loop
>   7          delete from t
>   8          where x = i.x
>   9               and rowid != i.rowid;
>  10      End loop;
>  11  End;
>  12  /

> PL/SQL procedure successfully completed.

> SQL>
> SQL> select * from t;

>          X
> ----------
>          2

> You don't need plsql to do this, it would just be something like:

> SQL> delete from T
>   2   where rowid <> ( select max(rowid)
>   3                      from T t2
>   4                     where t2.x = t.x )
>   5  /

> 1 row deleted.

> SQL> select * from t;

>          X
> ----------
>          1
>          2

> that picks a 'random' duplicate (the one with the max rowid) to keep and
deletes
> the rest of them.


> >> Hi all,

> >> I would like a script which would find all the duplicate rows in the
table
> >> and keep 1 of the duplicate rows and delete the others.

> >> Any suggestions.

> >> Regards, Paul

> --
> See http://govt.us.oracle.com/~tkyte/ for my columns 'Digging-in to
Oracle8i'...
> Current article is "Part I of V, Autonomous Transactions" updated June
21'st


> Oracle Service Industries     Reston, VA   USA

> Opinions are mine and do not necessarily reflect those of Oracle

Corporation


Fri, 25 Jan 2002 03:00:00 GMT
 PL/SQL: best way to find and remove duplicate rows ( keep 1 row )
Try this:

Let the primary keys(or duplicate keys) be col1, col2 of table1

delete from table1 A
where A.rowid > (select min(rowid) from table1 B
                 where A.col1 = B.col1
                 and  A.col2= B.col2)

Alternatively,
delete from table1 A
where A.rowid < (select max(rowid) from table1 B
                 where A.col1 = B.col1
                 and  A.col2= B.col2)
would work too

Chandu

delete from table



Quote:
> Thomas and Ken,

> Thanks so much for replying.

> I took your fancy correlated subquery concept and applied it to
> also show all duplicates in a table first.

> create table my_table1
>  ( my_table char(30), my_count integer );

> -- Remove duplicates keeping one dup with max(rowid)
> delete from my_table1 t
>    where rowid <> ( select max(rowid)
>                       from my_table1 t2
>                      where t2.my_table = t.my_table );

> -- Find and Display Each duplicate row
> select a.*, rowid  from my_table1 a
>  where a.my_table = ( select b.my_table  from
>     my_table1 b
>     where b.my_table = a.my_table
>     group by b.my_table
>     having count(*) > 1 )
>  order by my_table;

> Regards, Paul



> > A copy of this was sent to Kenneth C Stahl


- Show quoted text -

Quote:
> > (if that email address didn't require changing)

> > >Declare
> > >    Cursor C1 is
> > >    select distinct key,rowid
> > >     from mytable;
> > >Begin
> > >    for i in C1 loop
> > >        delete mytable
> > >        where key = i.key
> > >             and rowid != i.rowid;
> > >    End loop
> > >End;

> > that deletes ALL duplicates - it does not leave one behind.

> > consider:

> > SQL> create table t ( x int );
> > Table created.

> > SQL> insert into t values ( 1 );
> > SQL> insert into t values ( 1 );
> > SQL> insert into t values ( 2 );
> > SQL> commit;
> > Commit complete.

> > SQL> Declare
> >   2      Cursor C1 is
> >   3      select distinct x,rowid
> >   4       from T;
> >   5  Begin
> >   6      for i in C1 loop
> >   7          delete from t
> >   8          where x = i.x
> >   9               and rowid != i.rowid;
> >  10      End loop;
> >  11  End;
> >  12  /

> > PL/SQL procedure successfully completed.

> > SQL>
> > SQL> select * from t;

> >          X
> > ----------
> >          2

> > You don't need plsql to do this, it would just be something like:

> > SQL> delete from T
> >   2   where rowid <> ( select max(rowid)
> >   3                      from T t2
> >   4                     where t2.x = t.x )
> >   5  /

> > 1 row deleted.

> > SQL> select * from t;

> >          X
> > ----------
> >          1
> >          2

> > that picks a 'random' duplicate (the one with the max rowid) to keep
and
> deletes
> > the rest of them.


> > >> Hi all,

> > >> I would like a script which would find all the duplicate rows in
the
> table
> > >> and keep 1 of the duplicate rows and delete the others.

> > >> Any suggestions.

> > >> Regards, Paul

> > --
> > See http://govt.us.oracle.com/~tkyte/ for my columns 'Digging-in to
> Oracle8i'...
> > Current article is "Part I of V, Autonomous Transactions" updated
June
> 21'st


> > Oracle Service Industries     Reston, VA   USA

> > Opinions are mine and do not necessarily reflect those of Oracle
> Corporation

Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.


Sat, 26 Jan 2002 03:00:00 GMT
 
 [ 6 post ] 

 Relevant Pages 

1. TSQL Help on removing duplicate rows except first and last row

2. SQL statment to remove duplicate rows?

3. Best T-SQL method to remove rows

4. sql to find selective duplicate rows

5. REPOST: sql to find selective duplicate rows

6. SQL find duplicate rows sum values and merge

7. Removing Duplicate Rows?

8. Removing duplicate rows (sort of :)

9. Removing duplicate rows from a table

10. removing duplicate rows

11. Removing duplicate rows

12. Remove duplicates from rows


 
Powered by phpBB® Forum Software