vishal kadu

Reputation: 11

Pentaho Kettle program in Java to merge multiple CSV files by column

I have two CSV files, employee.csv and loan.csv.

employee.csv has four columns: empid (Integer), name (String), age (Integer), education (String).

loan.csv has three columns: loan (Double), balance (Double), empid (Integer).

I want to merge these two CSV files into a single CSV file on the empid column. In the result.csv file the columns should be,

I also have to achieve this using only the Kettle API from a Java program. Can anyone please help me?

Upvotes: 0

Views: 743

Answers (1)

Rishu S

Reputation: 3968

First of all, you need to create a Kettle transformation as below:

  1. Add two "CSV file input" steps, one for employee.csv and another for loan.csv.
  2. Hop both inputs to a "Stream lookup" step and look up on "empid".
  3. Final step: add a "Text file output" step to write the merged CSV file.
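
To make it clearer what the lookup on empid produces, here is a rough plain-Java sketch of the same merge, without Kettle. It is only an illustration: the class name `CsvLookupMerge`, the output column order, and the sample rows are assumptions of mine, not taken from the question or the .ktr. It also assumes each file starts with a header line and has no quoted fields containing commas.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kettle: mimics a lookup on empid.
public class CsvLookupMerge {

    // employeeLines: header + rows of empid,name,age,education
    // loanLines:     header + rows of loan,balance,empid
    public static List<String> merge(List<String> employeeLines, List<String> loanLines) {
        // Index loan rows by empid (third column of loan.csv).
        // If an empid appears more than once, this sketch keeps the first row.
        Map<String, String[]> loansByEmpId = new HashMap<>();
        for (String line : loanLines.subList(1, loanLines.size())) {
            String[] fields = line.split(",", -1);
            loansByEmpId.putIfAbsent(fields[2].trim(), fields);
        }

        List<String> result = new ArrayList<>();
        // Assumed output column order: employee columns first, then loan columns.
        result.add("empid,name,age,education,loan,balance");
        for (String line : employeeLines.subList(1, employeeLines.size())) {
            String[] emp = line.split(",", -1);
            String[] loan = loansByEmpId.get(emp[0].trim());
            // Employees without a loan row get empty loan and balance fields.
            String loanValue = loan == null ? "" : loan[0].trim();
            String balance = loan == null ? "" : loan[1].trim();
            result.add(String.join(",", emp[0].trim(), emp[1].trim(),
                    emp[2].trim(), emp[3].trim(), loanValue, balance));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> employees = List.of(
                "empid,name,age,education", "1,Asha,30,MSc", "2,Ravi,41,BSc");
        List<String> loans = List.of("loan,balance,empid", "5000.0,1200.5,1");
        merge(employees, loans).forEach(System.out::println);
    }
}
```

Every employee row is kept and the loan columns are filled in where a matching empid exists, which is the behaviour you would expect from a lookup as opposed to an inner join.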

I have placed the .ktr code here.

Secondly, if you want to execute this transformation using Java, I suggest you read this blog, where I have explained how to execute a .ktr/.kjb file from Java.


Extra points:

If the names of the CSV files need to be passed as parameters from the Java code, you can do that by adding the code below:

  trans.setParameterValue(parameterName, parameterValue);

where parameterName is the name of a parameter defined in the transformation and parameterValue is the name or location of the file.

I have already defined the file names as parameters in the Kettle code I have shared.
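
For reference, a minimal sketch of running the transformation from Java with parameters could look like the following. It needs the Kettle engine libraries (kettle-core, kettle-engine and their dependencies) on the classpath. The file name `merge_by_empid.ktr` and the parameter names `EMPLOYEE_FILE`, `LOAN_FILE` and `OUTPUT_FILE` are placeholders I made up for this example; use whatever names are actually defined in your .ktr.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunMergeTransformation {

    public static void main(String[] args) throws KettleException {
        // Initialise the Kettle environment (loads plugins, step types, etc.).
        KettleEnvironment.init();

        // Load the transformation definition; the path is a placeholder.
        TransMeta transMeta = new TransMeta("merge_by_empid.ktr");
        Trans trans = new Trans(transMeta);

        // Hypothetical parameter names; they must match the named
        // parameters declared in the transformation settings.
        trans.setParameterValue("EMPLOYEE_FILE", "employee.csv");
        trans.setParameterValue("LOAN_FILE", "loan.csv");
        trans.setParameterValue("OUTPUT_FILE", "result");

        // Copy the parameter values into variables before execution.
        // Calling this explicitly is a precaution in this sketch.
        trans.activateParameters();

        // Run the transformation and block until all steps have finished.
        trans.execute(null);
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            throw new IllegalStateException(
                    "Transformation finished with " + trans.getErrors() + " error(s)");
        }
    }
}
```

If a parameter name does not exist in the transformation, setParameterValue throws an exception, so a typo in the name shows up immediately.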

Hope it helps :)

Upvotes: 1
