Reputation: 92026
I am reading a text file in my program which contains some Unicode BOM characters (\ufeff, i.e. code point 65279) in places. This presents several issues in further parsing.
Right now I am detecting and filtering these characters myself, but I would like to know if the Java standard library or Guava has a way to do this more cleanly.
Upvotes: 4
Views: 6721
Reputation: 61148
There is no built-in way of dealing with a (UTF-8) BOM in Java or, indeed, in Guava.
There is currently a bug report on the Guava website about dealing with a BOM in Guava IO.
There are several SO posts (here and here) on how to detect/skip the BOM while reading a file in plain Java.
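For illustration, here is a minimal sketch of the plain-Java approach, assuming the file is decoded as UTF-8 so that the BOM shows up as the single character \ufeff. The class name `BomSkipper` and its method names are hypothetical, made up for this example; they are not part of the JDK or Guava.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK or Guava.
final class BomSkipper {
    // U+FEFF is what a BOM decodes to once the bytes have become chars.
    static final char BOM = '\uFEFF';

    // Drops a single BOM at the very start of the string, if there is one.
    static String stripLeadingBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == BOM) ? s.substring(1) : s;
    }

    // Drops every U+FEFF, for input where the character appears "in places".
    static String removeAllBoms(String s) {
        return s.replace(String.valueOf(BOM), "");
    }

    // Wraps the stream in a UTF-8 reader and consumes a leading BOM if present.
    static BufferedReader openUtf8SkippingBom(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.mark(1);              // remember the position before the first char
        if (reader.read() != BOM) {  // not a BOM (or empty stream): rewind
            reader.reset();
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // EF BB BF is the UTF-8 encoding of U+FEFF.
        byte[] bytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        try (BufferedReader r = openUtf8SkippingBom(new ByteArrayInputStream(bytes))) {
            System.out.println(r.readLine());
        }
    }
}
```

`stripLeadingBom` only touches the first character, which is where a real byte order mark sits. `removeAllBoms` is the blunter option if the character also turns up mid-file, for example after files with BOMs have been concatenated.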
Your BOM (\ufeff) seems to be UTF-16 which, according to the same Guava report, should be dealt with automatically by Java. This SO post seems to suggest the same.
Upvotes: 10