Reputation: 21154
I have a bug report showing an EEncodingError. The log points to TFile.AppendAllText. I call TFile.AppendAllText in this procedure of mine:
procedure WriteToFile(CONST FileName: string; CONST uString: string; CONST WriteOp: WriteOpperation; ForceFolder: Boolean= FALSE); // Works with UNC paths
begin
if NOT ForceFolder
OR (ForceFolder AND ForceDirectoriesMsg(ExtractFilePath(FileName))) then
if WriteOp= (woOverwrite)
then IOUtils.TFile.WriteAllText (FileName, uString)
else IOUtils.TFile.AppendAllText(FileName, uString);
end;
This is the information from EurekaLog.
What can cause this to happen?
Upvotes: 20
Views: 45172
Reputation: 21154
Proper function to write Unicode strings to a UTF8 file. FileName must be a full path. If the path does not exist, it is created. It can also write a preamble.
TYPE
TWriteOperation= (woAppend, woOverwrite);
procedure StringToFile(CONST FileName: string; CONST aString: String; CONST WriteOp: TWriteOperation= woOverwrite; WritePreamble: Boolean= FALSE);
VAR
Stream: TFileStream;
Preamble: TBytes;
sUTF8: RawByteString;
aMode: Integer;
begin
ForceDirectories(ExtractFilePath(FileName));
if (WriteOp= woAppend) AND FileExists(FileName)
then aMode := fmOpenReadWrite
else aMode := fmCreate;
Stream := TFileStream.Create(FileName, aMode OR fmShareDenyWrite); { Allow others to read while we write. The share flag must be OR-ed into Mode; the third (Rights) parameter is ignored on Windows }
TRY
sUTF8 := Utf8Encode(aString); { UTF-16 to UTF-8 encoding conversion }
if (aMode = fmCreate) AND WritePreamble then
begin
preamble := TEncoding.UTF8.GetPreamble;
Stream.WriteBuffer( PAnsiChar(preamble)^, Length(preamble));
end;
if aMode = fmOpenReadWrite
then Stream.Position:= Stream.Size; { Go to the end }
Stream.WriteBuffer( PAnsiChar(sUTF8)^, Length(sUTF8) );
FINALLY
FreeAndNil(Stream);
END;
end;
{ Tries to auto-determine the file type (ANSI, UTF8, UTF16, etc). Works with UNC paths.
If the file does not exist, it raises an error unless IgnoreExists is True.
If it cannot detect the correct encoding automatically, we can force the encoding we want through the Enc parameter.
Example: System.SysUtils.TEncoding.UTF8
However, this is buggy! It will raise an exception if the file is ANSI, but it contains high characters such as ½ (#189) }
function StringFromFile(CONST FileName: string; IgnoreExists: Boolean= FALSE; Enc: TEncoding= NIL): String;
begin
if IgnoreExists AND NOT FileExists(FileName)
then EXIT('');
if Enc= NIL
then Result:= System.IOUtils.TFile.ReadAllText(FileName)
else Result:= System.IOUtils.TFile.ReadAllText(FileName, Enc);
end;
{ Read a WHOLE file and return its content as AnsiString.
The function will not try to auto-determine the file's type.
It will simply read the file as ANSI }
function StringFromFileA(CONST FileName: string): AnsiString;
VAR Stream: TFileStream;
begin
Result:= '';
Stream:= TFileStream.Create(FileName, fmOpenRead OR fmShareDenyNone);
TRY
if Stream.Size>= High(Longint)
then RAISE Exception.Create('File is larger than 2GB! Only files below 2GB are supported.'+ CRLFw+ FileName);
SetString(Result, NIL, Stream.Size);
Stream.ReadBuffer(Pointer(Result)^, Stream.Size);
FINALLY
FreeAndNil(Stream);
END;
end;
This code was extracted from the LightSaber Delphi library.
Upvotes: 0
Reputation: 1
In this way it will work:
TFile.WriteAllText(FileName, 'é', TEncoding.UTF8);
Upvotes: 0
Reputation: 613063
This program reproduces the error that you report:
{$APPTYPE CONSOLE}
uses
System.SysUtils, System.IOUtils;
var
FileName: string;
begin
try
FileName := TPath.GetTempFileName;
TFile.WriteAllText(FileName, 'é', TEncoding.ANSI);
TFile.AppendAllText(FileName, 'é');
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
end.
Here I have written the original file as ANSI, and then called AppendAllText, which will try to write as UTF-8. What happens is that we end up in this function:
class procedure TFile.AppendAllText(const Path, Contents: string);
var
LFileStream: TFileStream;
LFileEncoding: TEncoding; // encoding of the file
Buff: TBytes;
Preamble: TBytes;
UTFStr: TBytes;
UTF8Str: TBytes;
begin
CheckAppendAllTextParameters(Path, nil, False);
LFileStream := nil;
try
try
LFileStream := DoCreateOpenFile(Path);
// detect the file encoding
LFileEncoding := GetEncoding(LFileStream);
// file is written is ASCII (default ANSI code page)
if LFileEncoding = TEncoding.ANSI then
begin
// Contents can be represented as ASCII;
// append the contents in ASCII
UTFStr := TEncoding.ANSI.GetBytes(Contents);
UTF8Str := TEncoding.UTF8.GetBytes(Contents);
if TEncoding.UTF8.GetString(UTFStr) = TEncoding.UTF8.GetString(UTF8Str) then
begin
LFileStream.Seek(0, TSeekOrigin.soEnd);
Buff := TEncoding.ANSI.GetBytes(Contents);
end
// Contents can be represented only in UTF-8;
// convert file and Contents encodings to UTF-8
else
begin
// convert file contents to UTF-8
LFileStream.Seek(0, TSeekOrigin.soBeginning);
SetLength(Buff, LFileStream.Size);
LFileStream.ReadBuffer(Buff, Length(Buff));
Buff := TEncoding.Convert(LFileEncoding, TEncoding.UTF8, Buff);
// prepare the stream to rewrite the converted file contents
LFileStream.Size := Length(Buff);
LFileStream.Seek(0, TSeekOrigin.soBeginning);
Preamble := TEncoding.UTF8.GetPreamble;
LFileStream.WriteBuffer(Preamble, Length(Preamble));
LFileStream.WriteBuffer(Buff, Length(Buff));
// convert Contents in UTF-8
Buff := TEncoding.UTF8.GetBytes(Contents);
end;
end
// file is written either in UTF-8 or Unicode (BE or LE);
// append Contents encoded in UTF-8 to the file
else
begin
LFileStream.Seek(0, TSeekOrigin.soEnd);
Buff := TEncoding.UTF8.GetBytes(Contents);
end;
// write Contents to the stream
LFileStream.WriteBuffer(Buff, Length(Buff));
except
on E: EFileStreamError do
raise EInOutError.Create(E.Message);
end;
finally
LFileStream.Free;
end;
end;
The error stems from this line:
if TEncoding.UTF8.GetString(UTFStr) = TEncoding.UTF8.GetString(UTF8Str) then
The problem is that UTFStr is not in fact valid UTF-8, and hence TEncoding.UTF8.GetString(UTFStr) throws an exception.
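To see the failing call in isolation, here is a minimal sketch. It assumes a Windows-1252 ANSI code page, where 'é' is the single byte $E9, and it assumes the RTL behaviour described in this answer, where decoding invalid UTF-8 raises EEncodingError. DecodesAsUtf8 is a hypothetical helper written for this illustration, not an RTL routine.

```delphi
program AnsiBytesAreNotUtf8;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper: True when Bytes decode cleanly as UTF-8,
// False when the decoder raises EEncodingError.
function DecodesAsUtf8(const Bytes: TBytes): Boolean;
begin
  try
    TEncoding.UTF8.GetString(Bytes); // result discarded, we only care whether it raises
    Result := True;
  except
    on EEncodingError do
      Result := False;
  end;
end;

begin
  // $E9 is 'é' in Windows-1252 (assumed ANSI code page). In UTF-8 a lone $E9
  // is the lead byte of a three-byte sequence with no continuation bytes.
  Writeln('ANSI bytes decode as UTF-8:  ', DecodesAsUtf8(TBytes.Create($E9)));
  // $C3 $A9 is the UTF-8 encoding of 'é'.
  Writeln('UTF-8 bytes decode as UTF-8: ', DecodesAsUtf8(TBytes.Create($C3, $A9)));
end.
```

This mirrors what AppendAllText does with UTFStr: it takes bytes produced by TEncoding.ANSI and hands them to the UTF-8 decoder.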
This is a defect in TFile.AppendAllText. Given that it knows perfectly well that UTFStr is ANSI encoded, it makes no sense at all for it to call TEncoding.UTF8.GetString.
You should submit a bug report to Embarcadero for this defect, which still exists in Delphi 10 Seattle. In the meantime you should not use TFile.AppendAllText.
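One possible way to avoid the encoding detection altogether is to append the bytes yourself in a fixed encoding. The following is only a sketch: AppendTextAsUtf8 is a hypothetical helper name, and it assumes the target file is already UTF-8 (or does not exist yet). If the existing file is ANSI and contains high characters, appending UTF-8 bytes would leave the file with mixed encodings, so the caller has to know what the file contains.

```delphi
program AppendUtf8Demo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes, System.IOUtils;

// Hypothetical helper (not part of the RTL): appends Contents as UTF-8
// without inspecting the existing file's encoding.
procedure AppendTextAsUtf8(const FileName, Contents: string);
var
  Stream: TFileStream;
  Bytes: TBytes;
begin
  Bytes := TEncoding.UTF8.GetBytes(Contents);
  if FileExists(FileName) then
    Stream := TFileStream.Create(FileName, fmOpenReadWrite or fmShareDenyWrite)
  else
    Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.Position := Stream.Size; // go to the end of the file
    if Length(Bytes) > 0 then
      Stream.WriteBuffer(Bytes[0], Length(Bytes));
  finally
    Stream.Free;
  end;
end;

var
  FileName: string;
begin
  try
    FileName := TPath.GetTempFileName;
    AppendTextAsUtf8(FileName, 'é');
    AppendTextAsUtf8(FileName, 'é');
    // Read back with an explicit encoding so no detection is involved.
    Writeln(Length(TFile.ReadAllText(FileName, TEncoding.UTF8)), ' characters read back');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

The design choice here is to never decode the existing file: nothing is passed to TEncoding.UTF8.GetString, so the code path that raises in AppendAllText is not reached.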
Upvotes: 24