Jamie Alford

Reputation: 95

Importing CSV in Talend: double-quote-delimited column is ignored

I have a CSV file with a timestamp field enclosed in double quotes and an unquoted email field, e.g.

Timestamp,Email
"2017-01-01 00:00:01",[email protected]
"2017-01-01 00:02:31",[email protected]

I have defined a metadata source for the CSV file, and it correctly identified and typed the two columns. When I execute the package, however, it treats the timestamp column as though it doesn't exist (usually I get the error 'Unparseable date: "[email protected]"').

I have tried changing a number of settings on the tFileInputDelimited component, including the escape and text enclosure options, and I have tried importing the timestamp both as a date and as a string. If I import it as a string, the timestamp field contains the email address and the email field is blank. Either way, I am unable to get the import to recognise the double-quoted timestamp column.

I'm assuming that I have done something that is causing it to escape the whole timestamp value, but I can't think of what that might be.

Upvotes: 1

Views: 2082

Answers (3)

Jamie Alford

Reputation: 95

If you are using metadata, then:

  1. Ensure that the component is referring to the repository (Component -> Property Type = Repository)
  2. Modify the metadata to change the text enclosure character to "\""

Upvotes: 1

tobi6

Reputation: 8239

If you can alter the input data, you should either enable quotes for all fields or for none.

If that is not an option, you could also read the file with tFileInputFullRow, remove the quotes with a String replace, and then process the data with tDenormalize to turn it into columns.
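The string-replace step could look roughly like the plain Java below, for example inside a tJavaRow. This is only a sketch: the class name, the method name `splitLine`, and the sample address `user@example.com` are made up for illustration, and it assumes that no field contains a comma or a double quote that needs to be kept.

```java
import java.util.Arrays;

public class QuoteStripSketch {

    // Hypothetical helper: removes every double quote from a raw line and
    // then splits on commas. Only safe when no field contains a comma or a
    // quote character that is part of the data.
    static String[] splitLine(String rawLine) {
        String unquoted = rawLine.replace("\"", "");
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        return unquoted.split(",", -1);
    }

    public static void main(String[] args) {
        // Sample line shaped like the question's data; the address is a placeholder.
        String line = "\"2017-01-01 00:00:01\",user@example.com";
        System.out.println(Arrays.toString(splitLine(line)));
    }
}
```

If the data can contain commas inside quoted fields, this shortcut breaks and a real CSV parser (or consistent quoting in the source file) is the safer route.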

Upvotes: 2

Corentin

Reputation: 2552

If you really want to keep the double quotes around the timestamp in your input file, try this date pattern:

"\"yyyy-MM-dd HH:mm:ss\""

This way, you specify that you need double quotes (\") in the input string.
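To see why the pattern works, here is a small standalone sketch using `java.text.SimpleDateFormat`, on the assumption that Talend's date parsing follows the same pattern rules. In `SimpleDateFormat`, characters that are not letters (such as `"`) are matched literally. The class and method names are made up for illustration, and the UTC time zone is only set to make the result reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class QuotedDatePatternSketch {

    // Hypothetical helper: parses a timestamp that still carries its
    // surrounding double quotes. The quotes in the pattern are non-letter
    // characters, so SimpleDateFormat matches them literally.
    static Date parseQuoted(String raw) {
        SimpleDateFormat fmt = new SimpleDateFormat("\"yyyy-MM-dd HH:mm:ss\"");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption, for a stable result
        fmt.setLenient(false);
        try {
            return fmt.parse(raw);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + raw, e);
        }
    }

    public static void main(String[] args) {
        Date parsed = parseQuoted("\"2017-01-01 00:00:01\"");
        System.out.println(parsed.getTime()); // epoch milliseconds
    }
}
```

Note that with this pattern an unquoted timestamp no longer parses, so it only fits files where the column is always quoted.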

Upvotes: 2
