Friday, May 18, 2012

How to remove duplicates without using "Remove Duplicates" stage in DataStage

Step 1: Add a Copy stage and, on its Input > Partitioning tab, set the partition type to Hash.

Step 2: Select the Perform sort option so the records are sorted within each partition.

Step 3: Select the Stable option so that, when duplicates exist, the first record for each key is the one retained.

Step 4: Select the Unique option to keep only one record per key value (DISTINCT records).

Step 5: Select the key fields, that is, the columns that define a duplicate. The same keys are used for both partitioning and sorting.
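
The steps above are set in the DataStage stage editor, not in code, but the logic they combine can be sketched in plain Python. This is only an illustration, not DataStage's implementation: the function name `dedupe_first`, the `partitions` parameter, and the sample records are all hypothetical. It hash-partitions records on the key fields, performs a stable sort within each partition, and keeps the first record for each key.

```python
from collections import defaultdict


def dedupe_first(records, key_fields, partitions=4):
    """Keep the first record per key, mimicking hash partition + stable sort + unique."""

    def key_of(rec):
        return tuple(rec[f] for f in key_fields)

    # Hash partitioning: records with the same key always land in the same bucket.
    buckets = defaultdict(list)
    for rec in records:
        buckets[hash(key_of(rec)) % partitions].append(rec)

    result = []
    for part in sorted(buckets):
        # Python's sorted() is stable, so records with equal keys keep input order.
        rows = sorted(buckets[part], key=key_of)
        previous = object()  # sentinel that never equals a real key
        for rec in rows:
            current = key_of(rec)
            if current != previous:  # unique: keep only the first record per key
                result.append(rec)
                previous = current
    return result


# Hypothetical sample data: two records share id 1.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
distinct = dedupe_first(rows, ["id"])
```

Because all records with the same key fall into the same partition, each partition can remove its own duplicates independently, which is why the partitioning keys and the sort keys must match.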
