Reputation: 956
I want to check if a file is a plain-text file. I tried the code below:
function IsTextFile(const sFile: TFileName): boolean;
//Created By Marcelo Castro - from Brazil
var
  oIn: TFileStream;
  iRead: Integer;
  iMaxRead: Integer;
  iData: Byte;
begin
  Result := True;
  oIn := TFileStream.Create(sFile, fmOpenRead or fmShareDenyNone);
  try
    iMaxRead := 1000; // only test the first 1000 bytes
    if iMaxRead > oIn.Size then
      iMaxRead := oIn.Size;
    for iRead := 1 to iMaxRead do
    begin
      oIn.Read(iData, 1);
      if iData > 127 then // any non-ASCII byte marks the file as binary
      begin
        Result := False;
        Break; // no need to read further
      end;
    end;
  finally
    FreeAndNil(oIn);
  end;
end;
This function works pretty well for text files based on ASCII chars. But text files can also include non-English chars. This function returns FALSE for non-English text files.
Is there any way to check if a file is a text file or a binary file?
Upvotes: 2
Views: 1075
Reputation: 395
You can't detect the codepage, you need to be told it. You can analyse the bytes and guess it, but that can give some bizarre (sometimes amusing) results. I can't find it now, but I'm sure Notepad can be tricked into displaying English text in Chinese.
It does not make sense to have a string without knowing what encoding it uses. You can no longer stick your head in the sand and pretend that "plain" text is ASCII. There Ain't No Such Thing As Plain Text. If you have a string, in memory, in a file, or in an email message, you have to know what encoding it is in or you cannot interpret it or display it to users correctly.
That's the first answer to: How can I detect the encoding/codepage of a text file
You should also keep in mind that any binary file can be valid text in some uncommon encoding. Also, binary files encoded in Base64 will pass any test you can think of, since Base64 is by definition a text representation of a binary stream.
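If a guess is good enough, one common heuristic is to accept bytes above 127 and reject only control bytes that rarely show up in text. The sketch below is an assumption, not a reliable test: the names BytesLookLikeText and FileLooksLikeText are made up for illustration, the 1000-byte limit is taken from the question, and UTF-16 text will be reported as binary because it contains NUL bytes. An empty file counts as text.

```delphi
program TextCheck;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils, Classes;

// Hypothetical helper: True when none of the first Count bytes is a control
// character other than tab, line feed, or carriage return.
// Bytes above 127 are accepted, so ANSI and UTF-8 text can pass.
function BytesLookLikeText(const Data: TBytes; Count: Integer): Boolean;
var
  I: Integer;
begin
  for I := 0 to Count - 1 do
    if (Data[I] < 32) and not (Data[I] in [9, 10, 13]) then
      Exit(False);
  Result := True;
end;

// Hypothetical wrapper: applies the heuristic to the first 1000 bytes of a file.
function FileLooksLikeText(const FileName: string): Boolean;
const
  MaxBytes = 1000;
var
  Stream: TFileStream;
  Buffer: TBytes;
  BytesRead: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    SetLength(Buffer, MaxBytes);
    // Read returns the number of bytes actually read, which may be less
    // than MaxBytes for short files.
    BytesRead := Stream.Read(Buffer[0], MaxBytes);
    Result := BytesLookLikeText(Buffer, BytesRead);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount >= 1 then
    Writeln(FileLooksLikeText(ParamStr(1)))
  else
    Writeln('usage: TextCheck <file>');
end.
```

A Base64-encoded binary will still pass this check, as noted above, so the result only says "contains no suspicious control bytes", not "is meant to be read as text".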
Upvotes: 1