Understanding of Transformer Looping & Caching
*************************************************************
Understanding of Transformer Looping & Caching:-
Power of
transformer looping is introduced from IIS Data stage 8.5 and above versions.
Transformer looping is made developers life easy in developing complex
scenarios which was almost difficult in earlier versions.
loop
variables:-
You can use loop variables when a loop condition is defined for
the Transformer stage. When a loop is defined, the Transformer stage can output
multiple rows for every row input to the stage. Loop variables are evaluated
every time that the loop is iterated, and so can change their value for every
output row. Such variables are accessible only from the Transformer stage in
which they are declared. You cannot use a loop variable in a stage variable
derivation.
Loop variables can be used as follows:
·
They can be assigned
values by expressions.
·
They can be used in
expressions which define an output column derivation.
·
Expressions evaluating a
variable can include other loop variables or stage variables or the variable being evaluated
itself.
@ITERATION:
- System variable used for transformer looping
What is Caching
in the Transformer?
The Transformer
Cache is an in-memory storage mechanism that is available from within the
Transformer stage and is used to help solve complex data integration scenarios.
The cache is a first-in/first-out (i.e. FIFO) construct and is accessible to
the developer via two new functions:
• SaveInputRecord: stores an input row
to back of the cache
• GetInputRecord: retrieves a saved row
from the front of the cache
I will try to explain
the transformer caching and looping using below scenario:-
Below is my input data:-
Customer
|
Store
|
Products
|
Value
|
600785011
|
103
|
10
|
100
|
600785011
|
104
|
20
|
200
|
600785011
|
109
|
30
|
300
|
Output:-
For each customer get a
list of the stores he shopped, with this list output every combination of
stores dropping customer from the output.
Home Store
|
Products
|
Value
|
Away Store
|
Products
|
Value
|
103
|
10
|
100
|
103
|
10
|
100
|
103
|
10
|
100
|
104
|
20
|
200
|
103
|
10
|
100
|
109
|
30
|
300
|
104
|
20
|
200
|
103
|
10
|
100
|
104
|
20
|
200
|
104
|
20
|
200
|
104
|
20
|
200
|
109
|
30
|
300
|
109
|
30
|
300
|
103
|
10
|
100
|
109
|
30
|
300
|
104
|
20
|
200
|
109
|
30
|
300
|
109
|
30
|
300
|
Below is Job Design:-
SeqFile ------->
TRFM1-------> TRFM2--------->TempDataset
TSFM1:-
Define below stage
variables in transformer:-
NumSavedRows=
SaveInputRecord
IsBreak=
LastRowInGroup(to_tsfm1.Customer)
svCurrStore= If IsBreak
Then to_tsfm1.Store:"|": svPrevStore Else to_tsfm1.Store
svPrevStore= If IsBreak
Then to_tsfm1.Products else to_tsfm1.Products
:"|": svPrevProduct svCurrProduct= If IsBreak Then to_tsfm1.Products
:"|": svPrevProduct Else to_tsfm1.Products
svPrevProduct= If
IsBreak Then to_tsfm1.Products else
to_tsfm1.Products :"|": svPrevProduct
svCurrValue= If IsBreak
Then to_tsfm1.Value :"|": svPrevValue Else to_tsfm1.Values
svPrevValue= If IsBreak
Then to_tsfm1.Value else to_tsfm1.Value
:"|": svPrevValue
NumRows= If IsBreak Then
NumSavedRows Else 0
Define below loop
conditions & Derivation:-
@ITERATION <= NumRows
OutputRecord=
GetSavedInputRecord()
Output Derivations:-
Store=Inputlink.Store
Product=InputLink.Product
Value=InputLink.Value
ConcatStore=svCurrStore
ConcatProduct=svCurrProduct
ConcatValue=svCurrValue
Customer=InputCustomer
The logic of this
transformer is as follows:
1 input data is read and stored in the cache via the "SaveInputRecord" function.
2.key breaks are tested via the "LastRowInGroup.
3.svCurrStore, svCurrProduct, svCurrValue will concatenate with pipl all stores,Products & values in the same group.
4.if there is a key break, the " NumRows " variable is set to the number of records on the cache and we'll skip to step 6
5.if there is no key break(NumRows=0) the Loop Condition will be false. This will cause the next input record to be read and processed .
6.finally, the transformer will loop while "@ITERATION <= NumRows " and read the cache records.
TSFM2:-
Define below Stage Variable:-
RowNum=Count(InputLink1.ConcatStore,”|”)
Define below loop
conditions :-
@ITERATION <= RowNum
Output
Derivation:-
Store=InputLink1.Store
Product=InputLink.Product
Value=InputLink.Value
AwayStore=
Field(to_tsfm2.ConcatStore,"|",@ITERATION)
Product=
Field(to_tsfm2.ConcatProducts,"|",@ITERATION)
Value=
Field(to_tsfm2.ConcatValue,"|",@ITERATION)
No comments:
Post a Comment