mczarnek

Reputation: 1383

Is it possible to know in C how many characters long the text read from a file will be?

I know in C++, you can check the length of the string, but in C, not so much.

Knowing the size of a text file, is it possible to know how many characters are in the file?

Is it one byte per character or are other headers secretly stored whether or not I set them?

I would like to avoid performing a null check on every character as I iterate through the file for performance reasons.

Thanks.

Upvotes: 1

Views: 797

Answers (2)

Alex Reynolds

Reputation: 96937

Characters (of type char) are single-byte values, as defined in the C standard: sizeof(char) is 1 by definition, and CHAR_BIT gives the number of bits in that byte. A NUL character is also a character, and so it, too, takes up a single byte.

Thus, if you are working with an ASCII text file, the file size will be the number of bytes and therefore equivalent to the number of characters.

If you are asking how long individual strings inside the file are, then you will indeed need to look for NUL bytes or other delimiters and calculate string lengths on that basis. Depending on how the file was made, you might not be able to safely assume that there is only one NUL character and that it is at the end of the file. There can also be newlines and other characters you would want to exclude. You have to decide on a character set and count against that set.

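Further, if you are working with a file containing multibyte characters in, say, a Unicode encoding such as UTF-8, then the answer will be different: one character can occupy several bytes, so the byte count no longer equals the character count. You would use different functions to read a text file that uses a multibyte encoding.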
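To illustrate why the byte count and the character count diverge, here is a minimal sketch that counts code points in a buffer. It assumes the data is valid UTF-8 (no validation is done), and the function name count_utf8_code_points is made up for this example. It relies on the fact that every UTF-8 continuation byte has the bit pattern 10xxxxxx, so counting the bytes that are not continuation bytes gives the number of code points.

```c
#include <stddef.h>

/* Hypothetical helper: counts code points in the first n bytes of s.
   Assumes s holds valid UTF-8; malformed input is not detected. */
size_t count_utf8_code_points(const char *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char b = (unsigned char)s[i];
        /* Continuation bytes look like 10xxxxxx; skip them. */
        if ((b & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

Note that a code point is not always what a user would call a character (combining marks, for instance), so even this is only an approximation of "number of characters".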

So the answer will depend on what type of encoding your text file uses, and whether you are calculating characters or string lengths, which are two different measures.

Upvotes: 0

M.M

Reputation: 141554

You can open the file and read all the characters and count them.
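A minimal sketch of that approach, using only standard C. The helper name count_chars and the 4096-byte buffer size are arbitrary choices for this example. Because fread reports how many characters it actually delivered, there is nothing to test on a per-character basis: text files are not NUL-terminated, and the end of the data is signalled by fread returning a short count.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number of characters read from the
   file at path in text mode, or -1 if it cannot be opened or a read fails. */
long long count_chars(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char buf[4096];
    long long total = 0;
    size_t n;

    /* fread returns the number of characters actually read;
       0 means end-of-file or an error. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)n;

    int failed = ferror(fp);   /* tell an error apart from end-of-file */
    fclose(fp);
    return failed ? -1 : total;
}
```

The file is opened in text mode here, so the result is the number of characters as the program would read them, which on some systems can differ from the size on disk.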

Besides that, there's no fully portable method to check how long a file is -- neither on disk, nor in terms of how many characters will be read. This is true for text files and binary files.

The question "How do you determine the size of a file in C?" goes over some of the pitfalls. Perhaps one of the solutions there will suit a subset of systems that you run your code on, or you might like to use a POSIX or operating system call.
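As one example of the POSIX route, here is a sketch using stat. It is not standard C and assumes a POSIX system. The helper name file_size_posix is made up for this example, and treating non-regular files as an error is a choice made here, since st_size is only meaningful for some file types.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the size in bytes of the regular file at
   path, or -1 if stat fails or the path is not a regular file. POSIX only. */
long long file_size_posix(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return -1;
    return (long long)st.st_size;
}
```

The value is the size on disk at the moment of the call. The file can change before you read it, so it is best treated as a hint (for sizing a buffer, say) rather than a guarantee of how many bytes a later read will return.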


As mentioned in the comments, if the intent behind the question is to read characters and process them on the fly, then you still need to check for read errors even if you know the file size, because reading can fail.
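A sketch of what such on-the-fly processing might look like. Counting newlines is only a stand-in for whatever processing is actually wanted, and the helper name count_lines is made up. The loop ends when getc returns EOF, and ferror is then used to tell a genuine end of file from a failed read.

```c
#include <stdio.h>

/* Hypothetical helper: counts newline characters in an open stream.
   Stores the count in *lines; returns 0 on clean end-of-file and
   -1 if the loop ended because of a read error. */
int count_lines(FILE *fp, long *lines)
{
    int c;      /* int, not char, so EOF is distinct from every byte value */
    long n = 0;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            n++;
    }

    *lines = n;
    return ferror(fp) ? -1 : 0;
}
```

The comparison against EOF is the only per-character test needed, and it has to be there whether or not the file size is known in advance.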

Upvotes: 4
