Reputation: 763
I have created a simple job in Talend that will perform an inner join in the data between 2 excel sheets and then dump the result in an output excel sheet. This can be best illustrated by the below diagram :-
The mapping used in tMap is :-
However the additional challenge for me now is that I have to perform this mapping only if the column value in that row is not NULL. eg there is a mapping row1.RECID = row2.RECID, but this should only be legal if row2.RECID is not NULL.
How do I achieve this in Talend? I have experimented a lot with tMap expressions but can't get it right..
Here is a small sample input and it's corresponding expected output.
Suppose my input has values :-
v1, v2,v3,v4
1 , A, O, 3
2, B, X, 4
3, C, X, 4
and lookup has values
v1, v2, v3
1, A, O, 3
2, null, X, 4
3, null, C, 4
2,null,X,null
Then the output should be :-
v1,v2,v3
1,A,O,3
2,B,X,4
2,B,X,4
Upvotes: 0
Views: 7658
Reputation: 1134
Talend does have a more elegant option that will allow the filtering of your data on multiple columns. Use the tSchemaComplianceCheck component where filtering out nulls and empty is as simple as clicking a couple of check boxes. This allows you to use your own schema to check against nulls and empty values and filter them out. The error rows go to a reject flow which you have the option of processing. If you do not wish to capture and process the rejects you can simply ignore them. Your main flow will only have the records that passed the compliance check. Here are some tips on using it:
In the tSchemaComplianceCheck component -->Basic Settings Screen click Custom Defined and it will show you each column. Make sure Nullable is unchecked or else it will allow nulls to pass thru.
In the Advanced Settings tab check Treat all empty string as NUll. This will work in conjunction with the prior step to filter out both null and empty.
In your Excel component, click Advances Settings tab, and check Stop reading on encountering empty rows.
below is a screen shot which shows the basic flow and setting. You would link to a tMap instead of the tLogRow. If I have understood your problem correctly I think you will find this is the ideal solution in Talend.
Upvotes: 0
Reputation: 3973
Before joining your input flows, you have to reject rows with null values, I have created a mapping based on the given simple data.
Upvotes: 0
Reputation: 262
Try to map the maximum of values from row1, the put row2 with left outer join.
I you want values which are only in row1 and row2, you can add a filter in row2 for that (but I guess that this is not what you want)
Upvotes: 0