Reputation: 59
I am reading a CSV file from a remote location (FTP), and the file contains rows with an invalid number of columns.
The stream does not progress when such rows are encountered. I need to skip them with an error message and proceed.
Here is what I have tried; the supervision strategy is not working:
source.via(CsvParsing.lineScanner().withAttributes(ActorAttributes.supervisionStrategy(throwable -> Supervision.resume())))
I need to skip each invalid row with an error message and continue with the next one.
Sample data: my CSV has 5 fields in each row.
1281,Export - Product Search Tags,0,Id,20
1282,Export - Product Search Tags,1,Id,10
1283,Export - Product Search Tags,2,Value,100
If I remove the last field in the 2nd row (i.e. 10), the stream fails and does not read the next line.
Upvotes: 0
Views: 620
Reputation: 718
CsvParsing doesn't seem to support actor supervision.
This is how the data is structured (see the docs):
Field Delimiter - separates the columns from each other (e.g. , or ;)
Quote - marks columns that may contain other structuring characters (such as Field Delimiters or line break) (e.g. ")
Escape Character - used to escape Field Delimiters in columns (e.g. \)
Lines are separated by either Line Feed (\n = ASCII 10) or Carriage Return and Line Feed (\r = ASCII 13 + \n = ASCII 10).
We expect the stream to fail with MalformedCsvException in the following cases (see the internal CsvParser.scala class):
1. Wrong escaping
2. No line end after the delimiter or when the maximum line length is reached
3. No delimiter or end of line when quote end is reached
4. Unclosed quote
Please check that none of these conditions is violated when the column is removed, and add the error the stream fails with to make the question more descriptive.
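Since a missing column is not one of the conditions listed above, the failure may come from a later stage that reads the fifth field rather than from the parser itself; that is an assumption until the actual error is posted. Under that assumption, a minimal sketch of the column-count check in plain Java could look like the following. The class name ColumnCountFilter, the method names, and the naive split on "," (a stand-in for the real parser, with no quote or escape handling) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnCountFilter {
    // Expected number of fields per row, taken from the sample data in the question.
    static final int EXPECTED_COLUMNS = 5;

    // True when the row has exactly the expected number of fields.
    static boolean hasExpectedColumns(List<String> row) {
        return row.size() == EXPECTED_COLUMNS;
    }

    // Keeps rows with the expected column count; logs and skips the others.
    static List<List<String>> keepValidRows(List<String> lines) {
        List<List<String>> valid = new ArrayList<>();
        for (String line : lines) {
            // Naive split standing in for the real CSV parser (assumption):
            // the -1 limit keeps trailing empty fields.
            List<String> row = Arrays.asList(line.split(",", -1));
            if (hasExpectedColumns(row)) {
                valid.add(row);
            } else {
                System.err.println("Skipping row with " + row.size() + " columns: " + line);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "1281,Export - Product Search Tags,0,Id,20",
            "1282,Export - Product Search Tags,1,Id", // last field removed
            "1283,Export - Product Search Tags,2,Value,100");
        System.out.println(keepValidRows(lines).size() + " valid rows");
    }
}
```

In the stream itself, the same size check could presumably go into a filter stage placed right after CsvParsing.lineScanner(), before any stage that accesses fields by index. If a later stage does throw, the supervision attribute would need to be attached to that stage (or to the whole graph) rather than to the line scanner.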
Upvotes: 2