Reputation: 43
I have a CSV file like this:
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"[email protected]"^"Male"
"2"^"Willaim"^"Khan"^"[email protected]"^"Male"
If I execute any Drill query on this file, I get the following error:
UserRemoteException : DATA_READ ERROR: Unexpected character '101' following quoted value of CSV field. Expecting '94'. Cannot parse CSV input.
But with a CSV like this:
^id^|^first_name^|^last_name^|^email^|^gender^
^1^|^John^|^Bharadwaj \\^|^[email protected]^|^Male^
^2^|^Willaim^|^Khan^|^[email protected]^|^Male^
Everything works fine.
This is my dfs configuration for CSV in Apache Drill. I am using version 1.21.1:
"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", "fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", "extractHeader": true }
The problem seems to occur when the escape character appears immediately before the closing quote. I changed the escape character to ~ and observed the same issue. Any insights?
Upvotes: 0
Views: 56
Reputation: 1389
It looks like the Drill CSV parser only uses the configured escape character to escape quote characters; in particular, the escape character cannot escape an occurrence of itself. I'm not sure why this limitation exists.
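To illustrate the behaviour described above, here is a toy field splitter in Python. It is not Drill's parser, only a minimal sketch of the rule "the escape character escapes quotes but not itself". The function name `split_quoted`, the `escape_self` switch, and the sample address `e@x.org` are all made up for illustration; the address starts with `e` only so that the toy error mirrors the reported character codes 101 (`e`) and 94 (`^`).

```python
def split_quoted(line, delim="^", quote='"', escape="\\", escape_self=False):
    """Toy splitter for fully quoted fields. NOT Drill's actual parser."""
    fields, i, n = [], 0, len(line)
    while i < n:
        if line[i] != quote:
            raise ValueError(f"expected opening quote at position {i}")
        i += 1
        buf = []
        while True:
            if i >= n:
                raise ValueError("unterminated quoted field")
            ch = line[i]
            if ch == escape and i + 1 < n:
                nxt = line[i + 1]
                # Escape always protects a quote; it protects itself only
                # when escape_self is True (the behaviour the asker expected).
                if nxt == quote or (escape_self and nxt == escape):
                    buf.append(nxt)
                    i += 2
                    continue
            if ch == quote:
                i += 1  # closing quote
                break
            buf.append(ch)
            i += 1
        fields.append("".join(buf))
        if i < n:
            if line[i] != delim:
                raise ValueError(
                    f"unexpected character {ord(line[i])} after quoted value, "
                    f"expecting {ord(delim)}"
                )
            i += 1
    return fields


# Hypothetical row modelled on the question's data (email is a placeholder).
line = r'"1"^"John"^"143 \\"^"e@x.org"^"Male"'

# With self-escaping, the two backslashes collapse to one and the field closes.
print(split_quoted(line, escape_self=True))

# Without it, the second backslash escapes the quote, the field runs on to the
# next quote, and the character after that is 'e' rather than '^'.
try:
    split_quoted(line)
except ValueError as err:
    print(err)
```

Under this model, the first backslash is taken literally, the second one escapes the closing quote, and the field is only closed by the opening quote of the next column, which would explain why the error complains about the first character of the following value.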
Upvotes: 0