Reputation: 273
I'm writing an application where I need to determine whether the files provided by the user are text, because I'm performing a search within them.
I'm not relying on the extension, because I also want to search source code files, for example, or any other file that has textual content (even with lesser-known extensions).
Is there a way to determine whether a file is text or not?
Upvotes: 1
Views: 1207
Reputation: 11426
Try the following:
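    import Foundation

    // Returns nil if no file exists at `path`, true if its contents cannot be decoded as a string.
    func isBinary(_ path: String) -> Bool? {
        guard FileManager.default.fileExists(atPath: path) else {
            return nil
        }
        // A failed decode is treated as "binary".
        return (try? String(contentsOfFile: path)) == nil
    }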
The problem with this code is that it reads the entire file, so the check is slow for large files.
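One way to soften the large-file problem is to decode only the beginning of the file. The sketch below is not part of the original answer: `looksBinary(atPath:limit:)` and the 8192-byte default are made-up names and values, and it assumes UTF-8 is the only text encoding you care about.

```swift
import Foundation

/// Hypothetical helper: reads at most `limit` bytes from the start of the file
/// and reports whether that prefix fails to decode as UTF-8.
/// Returns nil if the file cannot be opened.
func looksBinary(atPath path: String, limit: Int = 8192) -> Bool? {
    guard let handle = FileHandle(forReadingAtPath: path) else { return nil }
    defer { handle.closeFile() }

    let prefix = handle.readData(ofLength: limit)
    if prefix.isEmpty { return false }  // assumption: an empty file counts as text

    // If the read stopped at `limit`, the cut may fall inside a multi-byte
    // UTF-8 sequence, so also try dropping up to 3 trailing bytes.
    let maxTrim = prefix.count == limit ? min(3, prefix.count - 1) : 0
    for trim in 0...maxTrim {
        if String(data: prefix.dropLast(trim), encoding: .utf8) != nil {
            return false
        }
    }
    return true
}
```

Like the original, this is only a guess: a prefix that decodes cleanly says nothing about the rest of the file, and bytes such as NUL are still valid UTF-8.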
Upvotes: -1
Reputation: 273
Thanks everyone for the solutions provided! I just found a framework that seems to do the job quite well!
I'm leaving a link here for reference: https://github.com/aidansteele/MagicKit
Upvotes: 1
Reputation: 54781
You would need to open and read the data.
For ASCII text files, this means checking that every character is in the printable range (allowing for whitespace such as tabs and line breaks).
For Unicode text files (UTF-8, UTF-16, UTF-32), you may need to read the BOM (byte order mark) first to determine the encoding before reading the rest of the file.
Read more here: http://en.wikipedia.org/wiki/Text_file
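As a rough illustration of the two checks described in this answer, here is a minimal sketch. The function names `encodingFromBOM` and `isPrintableASCII` are invented for the example, and it assumes you have already read the file's bytes into an array. Keep in mind that many Unicode files carry no BOM at all, so a nil result does not mean the file is binary.

```swift
import Foundation

/// Hypothetical helper: returns the encoding implied by a leading byte order mark, if any.
func encodingFromBOM(_ bytes: [UInt8]) -> String.Encoding? {
    // Check the 4-byte UTF-32 marks before the 2-byte UTF-16 ones,
    // because FF FE 00 00 (UTF-32 LE) also starts with FF FE (UTF-16 LE).
    if bytes.starts(with: [0x00, 0x00, 0xFE, 0xFF]) { return .utf32BigEndian }
    if bytes.starts(with: [0xFF, 0xFE, 0x00, 0x00]) { return .utf32LittleEndian }
    if bytes.starts(with: [0xEF, 0xBB, 0xBF]) { return .utf8 }
    if bytes.starts(with: [0xFE, 0xFF]) { return .utf16BigEndian }
    if bytes.starts(with: [0xFF, 0xFE]) { return .utf16LittleEndian }
    return nil
}

/// True if every byte is printable ASCII (0x20...0x7E) or tab, line feed, or carriage return.
func isPrintableASCII(_ bytes: [UInt8]) -> Bool {
    let printable: ClosedRange<UInt8> = 0x20...0x7E
    return bytes.allSatisfy { printable.contains($0) || $0 == 0x09 || $0 == 0x0A || $0 == 0x0D }
}
```

Which whitespace bytes to allow in the ASCII check is a judgment call; the three used here are just a common choice.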
Upvotes: 1
Reputation: 10139
There is no way to be certain. Note, however, that most control characters do not appear in an ASCII text file. You can make a pretty good guess by defining a set containing most of the ASCII control characters (leaving out ones such as tab and newline), then counting the characters in the file that belong to that set; the count should be zero for an ASCII text file. In the end, though, you are trying to prove a negative, which is a troublesome thing to do.
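The counting idea could look roughly like the sketch below. `suspiciousControlCount(in:)` is a made-up name, and the choice of which control characters to tolerate (tab, line feed, form feed, carriage return) is an assumption, not something this answer specifies.

```swift
/// Hypothetical helper: counts bytes that are ASCII control characters
/// other than the whitespace controls commonly found in text.
func suspiciousControlCount(in bytes: [UInt8]) -> Int {
    // Assumption: tab, line feed, form feed, and carriage return are allowed in text.
    let allowed: Set<UInt8> = [0x09, 0x0A, 0x0C, 0x0D]
    return bytes.filter { ($0 < 0x20 || $0 == 0x7F) && !allowed.contains($0) }.count
}
```

A count of zero suggests ASCII text but does not prove it. Bytes of 0x80 and above are not counted here, so the check says nothing about non-ASCII encodings.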
Upvotes: 0