Reputation: 65
I'm trying to store a document with a string field in Vespa. When I use the document-api HTTP endpoint, the document is rejected with a parsing error. I've validated that the correct JSON is being sent (other documents go through fine).
Here is the error message that I'm seeing:
PARSER_ERROR Error in document 'id:x:y:n=1:1FVzo2l7mMLticB0WMkBKIECMLzAg' - could not parse field 'content' of type 'string': The string field value contains illegal code point 0xB
I can see that there's a check for these sorts of characters (vertical tab in my case) in allowedAsciiChars in com.yahoo.text.Text,
but I don't see anywhere in the documentation that I should be stripping these chars before sending to Vespa. In fact I see sort of the opposite situation where Vespa will go out of its way to replace certain chars behind the scenes without rejecting them.
Upvotes: 1
Views: 227
Reputation: 2339
I see sort of the opposite situation where Vespa will go out of its way to replace certain chars behind the scenes
Where do you see this?
Text.stripInvalidCharacters is a utility method provided for Java clients that need to strip such characters from non-sanitized text.
Upvotes: 1
Reputation: 971
Please strip ASCII control characters from the documents before feeding.
I will update the documentation, although it seems the JSON spec says these control characters must be escaped, so they are implicitly not allowed in the feed.
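For anyone feeding from a non-Java client, here is a minimal sketch of stripping the control characters before building the feed JSON. It is not Vespa's own implementation: the function names `strip_control_chars` and `sanitize_fields` are hypothetical, replacing with a space is just one possible choice, and the set of accepted characters (tab, line feed and carriage return kept, all other ASCII control characters below 0x20 treated as illegal) is an assumption based on the allowedAsciiChars check mentioned in the question. Verify it against the Vespa version you run.

```python
import re

# Assumption: tab (0x09), line feed (0x0A) and carriage return (0x0D) are
# accepted; every other ASCII control character below 0x20 is treated as
# illegal. This mirrors the check described in the question, not a documented
# contract.
_ILLEGAL_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_control_chars(text: str, replacement: str = " ") -> str:
    """Replace each illegal ASCII control character with `replacement`."""
    return _ILLEGAL_CONTROL_CHARS.sub(replacement, text)


def sanitize_fields(fields: dict) -> dict:
    """Return a copy of a flat fields dict with string values sanitized.

    Non-string values (numbers, nested structures) are passed through
    unchanged; nested strings are not handled in this sketch.
    """
    return {
        name: strip_control_chars(value) if isinstance(value, str) else value
        for name, value in fields.items()
    }
```

Call `sanitize_fields` on the `fields` object of each document just before serializing it to JSON. Pass `replacement=""` to drop the characters instead of replacing them with a space.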
Upvotes: 2