Friday, May 18, 2012
Thursday, May 17, 2012
Sort and Remove Duplicates stages in DataStage
Step 1: Drag all the stages.here in example we have chosen a sequential file(.csv file to be specific) as the source and a complex flat file as the destination file
Input data
Step 2: the records in source are sorted in sort stage.Make sure to choose a key in order to sort the records based on the key field.In the example, the key field is chosen as shipdate.
Step 3: In the Remove duplicate stage also we need to specify the key (shipdate chosen in the example) based on which it will search for the duplicate record(s) and remove them.Hence the output will contain only disctinct records based on the key specified.
Step 2: the records in source are sorted in sort stage.Make sure to choose a key in order to sort the records based on the key field.In the example, the key field is chosen as shipdate.
Step 3: In the Remove duplicate stage also we need to specify the key (shipdate chosen in the example) based on which it will search for the duplicate record(s) and remove them.Hence the output will contain only disctinct records based on the key specified.
Output file will be
Hence we get distcinct records with each of them having unique shipdate value.
Tuesday, May 15, 2012
Using Column Export stage in DataStage
The Column Export stage is a restructure stage. It can have a single input link, a single output link and a single rejects link.
The Column Export stage exports data from a number of columns of different data types into a single column of data type ustring, string, or binary. It is the complementary stage to Column Import
Monday, May 14, 2012
Using Pivot in DataStage
Here the source is taken as an excel file(.csv extension) and the data contains
Name Qty1 Qty2 Qty3
abc 10 20 30
def 70 60 50
After pivot stage (as described in figure) the derivation done is as follows in output column
Derivation
Name Name
Qty Qty1 Qty2 Qty3
(please make sure the datatype should be same in derivation column)
hence,The destination (flat file in thi example) will be
Name Qty
abc 10
abc 20
abc 30
def 70
def 60
def 50
what is the main difference between lookup ,join and merge in DataStage
LOOK UP STAGE:
* One output link and one reject link.
* Faster if the volume of data is less than the memory size.
* If the size is too huge then DS uses scrap files to store the
remaining data and that hinders the performance.
JOIN STAGE:
* Can have any no. of input links
* Does not use memory, look up is done on table level,
so would be better to use if the data volume is pretty huge.
* Doesn't have any reject link.
* Only one output link.
MERGE STAGE:
*Can have any number of input link and update
links.
* Need to have a key column.
* Can have any number of reject link.
* Only one output link.
Subscribe to:
Posts (Atom)