Pentaho - PDI: get fields of stream

Pretty simple question here: if I read a .csv file, for example, how can I find out at runtime which columns that file has? I want to convert the .csv file to JSON, but I don't know how to set the fields of the JSON Output step dynamically so that they include all the columns of the file. Can you help me expand my knowledge?

Thanks in advance

Upvotes: 1

Views: 1666

Answers (1)

eicherjc

Reputation: 301

This is definitely a good use case for metadata injection; the step is called ETL Metadata Injection. You will probably need to discover the fields dynamically first, using a scripting step (Java, JavaScript, and Python scripting steps are available, as well as R if you're an Enterprise customer). As far as I know, there is no built-in step that dynamically discovers the fields at runtime.
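As a rough illustration of the discovery part, here is a minimal sketch in plain Python that reads only the header row of a CSV and returns the column names. It is standalone code, not the API of any PDI scripting step; the function name `discover_fields` and the sample data are made up for the example, and it assumes the first line of the file is a header.

```python
import csv
import io


def discover_fields(csv_text, delimiter=","):
    """Return the column names found in the header row of CSV text.

    Assumes the first row is a header; returns an empty list for empty input.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])  # first row only; the data rows are not read
    return [name.strip() for name in header]


# Hypothetical sample data, standing in for the contents of the .csv file.
sample = "id,name,city\n1,Ana,Lisbon\n2,Bo,Oslo\n"
fields = discover_fields(sample)
print(fields)  # ['id', 'name', 'city']
```

Inside a transformation, the same idea would apply to the first line of the real file, with the resulting names passed on as rows for the injection step.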

Once you have the field names, you can use the ETL Metadata Injection step to inject them into the CSV Input or Text File Input step, as well as into the JSON Output step.
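Metadata injection is fed from a stream with one row per field, so the discovered names have to be turned into such rows. The sketch below shows that shape in plain Python. The key names (`field_name`, `field_type`, `json_element`), the function `build_injection_rows`, and the default type `String` are assumptions chosen for illustration, not PDI's actual injection keys; the real mapping is configured in the ETL Metadata Injection step itself.

```python
def build_injection_rows(field_names, default_type="String"):
    """Build one metadata row per discovered field.

    The dictionary keys are hypothetical placeholders for whatever
    properties are mapped in the injection step.
    """
    rows = []
    for position, name in enumerate(field_names, start=1):
        rows.append(
            {
                "position": position,         # order of the column in the file
                "field_name": name,           # name for the input step
                "field_type": default_type,   # assumed type; CSV carries none
                "json_element": name,         # element name for the JSON output
            }
        )
    return rows


# Hypothetical field names, as they might come from reading a CSV header.
rows = build_injection_rows(["id", "name", "city"])
for row in rows:
    print(row)
```

Everything is treated as a string here because a CSV header carries no type information; if types matter, they would have to be inferred or supplied separately.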

Here is the official help documentation on the ETL Metadata Injection step: https://help.pentaho.com/Documentation/8.1/Products/Data_Integration/Transformation_Step_Reference/ETL_Metadata_Injection

Upvotes: 1
