Xel Naga

Reputation: 956

Check if a file is a text file or a binary file using Delphi

I want to check if a file is a plain-text file. I tried the code below:

function IsTextFile(const sFile: TFileName): Boolean;
// Created by Marcelo Castro - from Brazil
var
  oIn: TFileStream;
  iRead: Integer;
  iMaxRead: Integer;
  iData: Byte;
begin
  Result := True;
  oIn := TFileStream.Create(sFile, fmOpenRead or fmShareDenyNone);
  try
    iMaxRead := 1000;  // only test the first 1000 bytes
    if iMaxRead > oIn.Size then
      iMaxRead := oIn.Size;
    for iRead := 1 to iMaxRead do
    begin
      oIn.Read(iData, 1);
      if iData > 127 then
      begin
        Result := False;  // non-ASCII byte found, no need to read further
        Break;
      end;
    end;
  finally
    FreeAndNil(oIn);
  end;
end;

This function works well for text files that contain only ASCII characters. But text files can also include non-English characters, which are typically stored as bytes above 127, so the function returns False for non-English text files.

Is there any way to check if a file is a text file or a binary file?

Upvotes: 2

Views: 1075

Answers (1)

Martial P

Reputation: 395

You can't detect the codepage, you need to be told it. You can analyse the bytes and guess it, but that can give some bizarre (sometimes amusing) results. I can't find it now, but I'm sure Notepad can be tricked into displaying English text in Chinese.

It does not make sense to have a string without knowing what encoding it uses. You can no longer stick your head in the sand and pretend that "plain" text is ASCII. There Ain't No Such Thing As Plain Text. If you have a string, in memory, in a file, or in an email message, you have to know what encoding it is in or you cannot interpret it or display it to users correctly.

That's from the first answer to this question: How can I detect the encoding/codepage of a text file

You should also keep in mind that any binary file could be valid text in some uncommon encoding. Conversely, a binary file encoded in Base64 will pass any test you can think of, because it is by definition a text representation of a binary stream.
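If a guess is good enough for your case, one common heuristic is to stop rejecting bytes above 127 and instead look for control characters that rarely appear in text. Here is a minimal sketch of that idea. The name LooksLikeText and the MaxBytes parameter are made up for illustration, and the list of allowed control characters (tab, LF, FF, CR) is an assumption you may want to adjust. It is still only a guess: UTF-16 text contains zero bytes and will be reported as binary, and Base64-encoded binary data will be reported as text.

```pascal
uses
  SysUtils, Classes;

// Heuristic only: returns True when the first MaxBytes bytes contain no
// control characters other than tab, line feed, form feed and carriage return.
// Bytes above 127 are accepted, so non-English text is not rejected.
function LooksLikeText(const FileName: TFileName; MaxBytes: Integer = 1000): Boolean;
var
  Stream: TFileStream;
  Buffer: array of Byte;
  BytesRead, I: Integer;
begin
  Result := True;
  if MaxBytes <= 0 then
    Exit;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    SetLength(Buffer, MaxBytes);
    // Read returns the number of bytes actually read (0 for an empty file)
    BytesRead := Stream.Read(Buffer[0], MaxBytes);
    for I := 0 to BytesRead - 1 do
      if (Buffer[I] < 32) and not (Buffer[I] in [9, 10, 12, 13]) then
      begin
        Result := False;  // unexpected control character, probably binary
        Break;
      end;
  finally
    Stream.Free;
  end;
end;
```

An empty file is treated as text here. Reading the sample into a buffer with one call also avoids the byte-by-byte reads of the function in the question.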

Upvotes: 1
