getting rid of duplicates 
Author Message
 getting rid of duplicates

I have about 3.9 million records, about 40% are duplicates.  I need to
remove all the duplicates.  Will this code perform ok to do the job?  Any
better method?  I dont have a primary key yet but will build one after this
process.  I could always create one temporarily.

Thanks,Adrian

begin tran deduplicate
    select DISTINCT *
    into #temp
    from geninfo

    truncate table geninfo

    insert geninfo
    select *
    from #temp

    --select * from geninfo

    drop table #temp
commit tran deduplicate



Sun, 11 Sep 2005 19:37:38 GMT
 getting rid of duplicates

I would suggest you to truncate the table after you are
seeing desired result set and then only commit the
transaction.
Rest everything looks ok. If you are looking for a primary
key with with an identity column. then you also have a
option
to generate the identity keys while running the "SELECT
INTO..." statement
as follows.

select *, IDENTITY(int, 1,1) pk_id
into #temp from
(select DISTINCT *
   from geninfo) A

Your temporary table will have an additional column as
PK_ID, hence you have to add a column to GENINFO table so
that
these ID values will be populated with the INSERT
statement.

--Vishal

Quote:
>-----Original Message-----
>I have about 3.9 million records, about 40% are

duplicates.  I need to
Quote:
>remove all the duplicates.  Will this code perform ok to
do the job?  Any
>better method?  I dont have a primary key yet but will

build one after this
Quote:
>process.  I could always create one temporarily.

>Thanks,Adrian

>begin tran deduplicate
>    select DISTINCT *
>    into #temp
>    from geninfo

>    truncate table geninfo

>    insert geninfo
>    select *
>    from #temp

>    --select * from geninfo

>    drop table #temp
>commit tran deduplicate

>.



Sun, 11 Sep 2005 20:13:17 GMT
 
 [ 2 post ] 

 Relevant Pages 

1. getting rid of duplicates

2. GETTING RID OF DUPLICATE COLUMNS..

3. Getting rid of duplicate records

4. Getting rid of duplicate records.

5. Getting Rid of Duplicate Data

6. Getting rid of the IDENTITY property on a column withou getting rid of the column

7. I am getting duplicate records, I mean everything is duplicate even Access record number

8. Get rid of Duplicate Data

9. How do you get rid of duplicate records?

10. How to get rid of duplicated rows?

11. How get rid of duplicate rows ?


 
Powered by phpBB® Forum Software