Reputation: 93
I want to retrieve duplicate records using Talend Open Studio for Data Integration. Example records are:
id name
1 suresh
2 ramesh
3 nagesh
4 suresh
Could anyone please answer the above question? The expected results are:
id name
1 suresh
4 suresh
Thanks in advance.
Upvotes: 2
Views: 4528
Reputation: 93
Finally, I found the duplicate records. I used the steps below.
First, connect the delimited file input to tUniqRow. Then connect the duplicates output of tUniqRow to tAggregateRow, and in tAggregateRow group by id. After that, connect the result to tMap.
In tMap I joined on id == id and made sure the join model is set to Inner Join.
Upvotes: 1
Reputation: 416
As long as the tUniqRow duplicates output doesn't work properly, you can use a trick. I split your task into two steps.
First, you need to get the names that are duplicated. You can do this with the tAggregateRow component: group by name and count the number of ids. Then, after filtering for count > 1, you can save these results in tHashOutput. tHashOutput keeps the results in memory, so they can be used later in the job.
In the second step, read your data again and use tMap to match it against the results saved in tHashOutput. If you set Join Model = Inner Join, then the tMap output will contain only the rows whose names exist in the saved duplicates.
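To make the logic of the two steps easier to follow, here is a rough sketch in plain Python. This is not Talend code and not something you paste into the job; it only mirrors what the components do, using the sample rows from the question. The function name `find_duplicates` and the variable names are made up for illustration.

```python
from collections import Counter

# Sample rows from the question: (id, name)
rows = [(1, "suresh"), (2, "ramesh"), (3, "nagesh"), (4, "suresh")]


def find_duplicates(rows):
    """Return every row whose name occurs more than once (hypothetical helper)."""
    # Step 1: roughly what tAggregateRow plus the count > 1 filter does.
    # Group by name and count how many ids share each name.
    counts = Counter(name for _row_id, name in rows)
    # Keep only the duplicated names in memory, like tHashOutput would.
    duplicated_names = {name for name, count in counts.items() if count > 1}

    # Step 2: read the data again and keep only rows that match a saved name,
    # which is the effect of an inner join on name in tMap.
    return [(row_id, name) for row_id, name in rows if name in duplicated_names]


for row_id, name in find_duplicates(rows):
    print(row_id, name)
```

With the sample data this prints the rows with ids 1 and 4, both named suresh, which matches the expected result in the question.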
Upvotes: 3