Reputation: 115
I'm creating a job in Talend that has to generate files containing data produced by tRowGenerator along with data from other sources: a SQL Server database and delimited files.
The issue is that I get duplicated rows with the same primary key. All I want is 100 records (420 rows): for each random UUID generated I should get 42 rows, and so on. Instead, I'm getting the same row 10 times (it's duplicated 10 times).
I'm getting data from 3 sources as shown below:
To get these fields in my output file:
Upvotes: 0
Views: 718
Reputation: 4061
If I understand correctly, you're using one of the functions in tRowGenerator to get random data.
The problem is that the data generation functions available from Talend are not really random: they pick their values from a predefined list. You can look at the source code to verify that each list holds a hundred or so values, so you're bound to get duplicates.
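To see why a fixed list leads to duplicates, here is a minimal sketch in plain Java. It is not Talend's actual code: `DuplicateDemo`, `distinctFromPool` and `distinctUuids` are hypothetical names, and the pool size of 100 is only an assumption based on the "hundred or so" figure above. It counts how many distinct values you get when drawing from a fixed pool versus generating UUIDs.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.UUID;

public class DuplicateDemo {

    // Hypothetical stand-in for a generator that picks from a predefined list:
    // each draw is just a random index into a pool of poolSize values.
    static int distinctFromPool(int poolSize, int draws, long seed) {
        Random rnd = new Random(seed);
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < draws; i++) {
            seen.add(rnd.nextInt(poolSize));
        }
        return seen.size(); // can never exceed poolSize
    }

    // Same experiment, but every draw is a freshly generated random UUID.
    static int distinctUuids(int draws) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < draws; i++) {
            seen.add(UUID.randomUUID().toString());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct values from a pool of 100: "
                + distinctFromPool(100, 100, 42L));
        System.out.println("distinct UUIDs: " + distinctUuids(100));
    }
}
```

The number of distinct values from the pool is capped by the pool size no matter how many rows you generate, while a collision between random UUIDs is so unlikely that it can be ignored in practice.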
To get unique values, create a Talend routine with a simple method that generates a UUID:
public class Utils {

    /**
     * getRandom: return a random UUID
     *
     *
     * {talendTypes} String
     *
     * {Category} User Defined
     *
     * {param} string("world") input: dummy input
     *
     * {example} getRandom("world") # 01e98b98-05d6-427c-978d-1f86d0ea4712
     */
    public static String getRandom(String input) {
        // The input parameter is a dummy and is ignored; each call returns a new random UUID.
        return java.util.UUID.randomUUID().toString();
    }
}
You can then access this function from tRowGenerator:
One more thing: I'm not sure what exactly your requirement is, but since you don't have a join key between your inputs, you are getting a cartesian join between all of them (42x298x206 rows). So you might want to define a join condition.
If you do define a join condition, make sure the tMap inputs are in the right order (you are using the tRowGenerator flow as the main connection, and the others as lookups).
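To illustrate the difference a join key makes, here is a rough sketch in plain Java rather than tMap. `JoinDemo`, `cartesianRows` and `keyedJoin` are hypothetical names, and the assumption that the key sits in the first column of each row is mine. The first method computes the row count of a join without keys; the second mimics a main flow matched against one lookup on a key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinDemo {

    // Without a join key, every main row is paired with every row of each lookup.
    static long cartesianRows(int mainRows, int lookupA, int lookupB) {
        return (long) mainRows * lookupA * lookupB;
    }

    // With a join key (assumed here to be column 0), each main row is paired
    // only with the lookup rows that share its key.
    static List<String> keyedJoin(List<String[]> mainRows, List<String[]> lookupRows) {
        Map<String, List<String>> lookupByKey = new HashMap<>();
        for (String[] row : lookupRows) {
            lookupByKey.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            List<String> matches =
                    lookupByKey.getOrDefault(row[0], Collections.<String>emptyList());
            for (String match : matches) {
                out.add(row[0] + ";" + row[1] + ";" + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Row counts taken from the inputs mentioned above.
        System.out.println(cartesianRows(42, 298, 206)); // 2578296

        List<String[]> main = new ArrayList<>();
        main.add(new String[] {"1", "a"});
        main.add(new String[] {"2", "b"});
        List<String[]> lookup = new ArrayList<>();
        lookup.add(new String[] {"1", "x"});
        lookup.add(new String[] {"3", "y"});
        lookup.add(new String[] {"2", "z"});
        System.out.println(keyedJoin(main, lookup)); // [1;a;x, 2;b;z]
    }
}
```

Without keys, the three inputs multiply out to more than 2.5 million rows; with a key, the output only grows with the number of matching rows.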
Upvotes: 1