user145610

Reputation: 3035

Convert a file to UTF-8 without StreamWriter

Is there a way to convert file stream data to a UTF-8 file stream without using StreamWriter? At the moment I read the file line by line and write each line to a UTF-8 file. Is there a faster way to convert a file to UTF-8 encoding?

 // Assumes reader is a StreamReader already opened on the source file (not shown in the question).
 using (StreamWriter writer = new StreamWriter(destinationFile, System.Text.Encoding.UTF8)) {
     string line;
     while ((line = reader.ReadLine()) != null) {
         writer.WriteLine(line);
     }
 }

Is there an overload in MemoryStream or FileStream that converts a file to a UTF-8 encoded file?

Upvotes: 0

Views: 1530

Answers (1)

pid

Reputation: 11607

Yes:

string text = File.ReadAllText(srcFilename);
File.WriteAllText(dstFilename, text, System.Text.Encoding.UTF8);
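One caveat with the two lines above: File.ReadAllText detects the source encoding only from a byte order mark and otherwise assumes UTF-8, and Encoding.UTF8 writes a BOM at the start of the output. Here is a minimal sketch that makes both choices explicit. The helper name ConvertToUtf8 and its parameters are made up for illustration, and it still loads the whole file into memory, so it is only suitable for files that fit comfortably in RAM.

```csharp
using System.IO;
using System.Text;

static class Utf8Converter
{
    // Hypothetical helper: re-encodes srcPath into dstPath as UTF-8.
    // sourceEncoding is used when the source file carries no byte order mark.
    public static void ConvertToUtf8(string srcPath, string dstPath,
                                     Encoding sourceEncoding, bool writeBom = false)
    {
        // Reads the whole file into memory and decodes it to a string.
        string text = File.ReadAllText(srcPath, sourceEncoding);

        // new UTF8Encoding(false) omits the BOM; Encoding.UTF8 would emit one.
        File.WriteAllText(dstPath, text, new UTF8Encoding(writeBom));
    }
}
```

Usage, assuming the source file is ISO-8859-1: `Utf8Converter.ConvertToUtf8("input.txt", "output.txt", Encoding.GetEncoding("iso-8859-1"));`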

EDIT: reply to request in comment

The problem is with UTF-8 characters that require more than one byte (between 2 and 4). I loosely call them surrogates here, although strictly speaking that is a UTF-16 term and in UTF-8 they are multi-byte sequences. Let's say a block is 1024 bytes long (the problem arises for any block length, but the larger the blocks, the lower the probability of breaking a sequence). A multi-byte sequence is broken when it spans a block boundary, as shown here:

block  index  character  comment
0      0      a          block start
0      1      b
...
0      1022   a
0      1023   €          block end, this character is 3 bytes long
---------------------
1      1024   € (+1)     second byte of the character
1      1025   € (+2)     third byte of the character
...

As you can see, the three-byte character would be broken up between two blocks. When streaming in/out a block at a time, these cases have to be handled correctly in code.
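To illustrate the boundary case, here is a small sketch (not production code) that splits the bytes of "a€" inside the euro sign and decodes the two blocks in two ways. The class and method names are made up for this example. It relies on System.Text.Decoder, which carries an incomplete sequence over from one GetChars call to the next.

```csharp
using System;
using System.Text;

static class SplitSequenceDemo
{
    // Decodes every block on its own, so a sequence cut by a boundary is lost.
    public static string DecodeNaive(params byte[][] blocks)
    {
        var sb = new StringBuilder();
        foreach (byte[] block in blocks)
            sb.Append(Encoding.UTF8.GetString(block));
        return sb.ToString();
    }

    // Uses one Decoder for all blocks; it remembers trailing partial bytes.
    public static string DecodeStateful(params byte[][] blocks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        foreach (byte[] block in blocks)
        {
            // Worst-case buffer size for this many input bytes.
            char[] chars = new char[Encoding.UTF8.GetMaxCharCount(block.Length)];
            int n = decoder.GetChars(block, 0, block.Length, chars, 0);
            sb.Append(chars, 0, n);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // "a€" in UTF-8 is 0x61 followed by 0xE2 0x82 0xAC.
        byte[] bytes = Encoding.UTF8.GetBytes("a\u20AC");
        byte[] block0 = { bytes[0], bytes[1] }; // 'a' plus first byte of the euro sign
        byte[] block1 = { bytes[2], bytes[3] }; // remaining two bytes of the euro sign

        Console.WriteLine(DecodeNaive(block0, block1) == "a\u20AC");    // False
        Console.WriteLine(DecodeStateful(block0, block1) == "a\u20AC"); // True
    }
}
```

StreamReader does this bookkeeping internally, so the issue only shows up when you decode raw byte blocks yourself.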

For more examples and explanations with the actual byte codes, see Wikipedia. I could not possibly be more thorough and precise than they already are.

Upvotes: 2
