Reputation: 132
I am using Talend Studio to merge about 80 log files into one giant file. The files are plain .txt files. I currently have a job set up that merges them all (they share the same headers and formatting), but I have the following issue.
The first column contains the user's login ID. If the user is running off the server, the login ID is captured in the log; if they are running locally, it is not. When the login ID is null or blank, I need to pull it from the file path in column 4.
The path is set up as either C:\Documents and Settings\(login id here)\.... or C:\Users\(login id here)\.... or C:\DOCUME~1\(login id here)\..., so the login ID is always between the second and third backslashes. However, I am new to Talend and am not sure what expression to use to pull this value out and put it in the login ID field.
If anyone has a way of doing this, or can lead me in the right direction it would be very helpful!
Upvotes: 1
Views: 951
Reputation: 1011
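You have to write Java code in a tMap variable or expression. An example is below; it should be modified to fit your requirement. Note that lastIndexOf returns the last path segment, which is the login ID here only because the sample path ends with it.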
String strtemp = "C:\\document settings\\userid";
System.out.println(strtemp.substring(strtemp.lastIndexOf("\\")+1));
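For paths that continue past the login folder, one option is to split on backslashes and take the third element instead of using lastIndexOf. This is only a minimal sketch, assuming the login ID is always the folder directly under the profile root; the class name PathLogin, the method secondSegment and the sample logins are made up for illustration:

```java
// Hypothetical helper, not part of Talend.
class PathLogin {
    // Returns the folder that follows the profile root, or null if the path is too short.
    static String secondSegment(String path) {
        if (path == null) {
            return null;
        }
        // String.split takes a regex; "\\\\" in Java source matches one literal backslash.
        String[] parts = path.split("\\\\");
        // parts[0] = "C:", parts[1] = "Users" (or similar), parts[2] = login ID
        return parts.length > 2 ? parts[2] : null;
    }

    public static void main(String[] args) {
        // Sample paths with invented login IDs.
        System.out.println(secondSegment("C:\\Users\\jsmith\\logs\\app.txt"));  // prints jsmith
        System.out.println(secondSegment("C:\\DOCUME~1\\adoe\\app.txt"));       // prints adoe
    }
}
```

Inside a tMap the same idea would be an expression along the lines of row1.filepath.split("\\\\")[2], where row1 and filepath are assumed names for your input flow and column, and you would still want a length check for short paths.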
Upvotes: 0
Reputation: 56849
You can use a tExtractRegexFields component to extract the login ID from your filepath and then conditionally map this to the login ID column in a tMap if the login ID field is null or blank.
A typical job to do this might look like:
This has an input of data (in this case a tFixedFlowInput component to hard-code the values into the job), a tExtractRegexFields component to extract the login ID from the filepath column, and then a tMap component to map the data conditionally.
The values in the above tFixedFlowInput component cover each of the path forms you see in your logs. They also include a row whose login ID differs from the one in its filepath, so you can see that an existing login ID is never overwritten and the one from the filepath is used only where necessary.
After this we need to configure the tExtractRegexFields component to look into the filepath column and attempt to find capture groups. I used the regex "^C:\\\\(?:Users|DOCUME~1|Documents and Settings)\\\\(\\w+)\\\\", which captures any "word-like" characters (letters, digits and underscores) up until a backslash occurs. You may have to tweak this to get the right results for your users. The schema for the tExtractRegexFields component also requires you to add an extra column for each capture group (which is also why I made the alternation group a non-capturing group), and it fills these columns sequentially. So if you have 3 capture groups in your regex but only 2 extra fields, only the first 2 capture groups will be used.
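To check the regex outside of Talend before wiring it into the component, something like the following plain Java sketch can help. It uses the same pattern with the standard java.util.regex classes; the class name RegexLogin, the method extract and the sample logins are invented for illustration and are not what Talend generates internally:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical test harness for the pattern, not Talend-generated code.
class RegexLogin {
    // Same regex as above, written as a Java string literal.
    static final Pattern LOGIN = Pattern.compile(
            "^C:\\\\(?:Users|DOCUME~1|Documents and Settings)\\\\(\\w+)\\\\");

    // Returns the first capture group, or null when the path does not match.
    static String extract(String path) {
        if (path == null) {
            return null;
        }
        Matcher m = LOGIN.matcher(path);
        // find() rather than matches(): the pattern only describes the start of the path.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("C:\\Documents and Settings\\jsmith\\app\\log.txt")); // prints jsmith
        // \w does not match '.', so a dotted login is not captured by this pattern.
        System.out.println(extract("C:\\Users\\john.smith\\log.txt"));                   // prints null
    }
}
```

The second call shows the kind of case where the capture group needs tweaking: a login containing a dot or a hyphen falls outside \w, so a character class such as [^\\\\]+ may suit your users better.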
Finally, we use some simple logic in our tMap component to use the extracted login ID where necessary:
Here we define a boolean variable that tests whether the login field is null or blank. If it is, we use the previously defined regexLogin value; otherwise we use the original login value.
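Written out as plain Java, that tMap logic is roughly the following. This is a sketch rather than the exact expressions from the job: the class LoginFallback and the method resolve are hypothetical, and treating whitespace-only values as blank (via trim()) is my assumption:

```java
// Hypothetical stand-in for the tMap variable and output expression.
class LoginFallback {
    static String resolve(String login, String regexLogin) {
        // Equivalent of the boolean tMap variable: true when login is null or blank.
        boolean loginMissing = login == null || login.trim().isEmpty();
        // Equivalent of the output expression: fall back to the extracted value only when needed.
        return loginMissing ? regexLogin : login;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "jsmith"));    // prints jsmith
        System.out.println(resolve("", "jsmith"));      // prints jsmith
        System.out.println(resolve("adoe", "jsmith"));  // prints adoe
    }
}
```

In the tMap itself the two pieces would be a variable such as row2.login == null || row2.login.trim().isEmpty() and an output expression such as Var.loginMissing ? row2.regexLogin : row2.login, where row2 and loginMissing are assumed names.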
And here's the result:
Notice how we successfully grab the user IDs for the 3 null or blank login ID entries, and how we defer to the original login ID when it clashes with the one extracted from the filepath.
Upvotes: 2