Reputation: 6023
When I get the file size using stat(), it gives different output depending on the file's contents. Why does it behave like this?
When "huffman.txt" contains a simple string like "Hi how are you", it gives file size = 14. But when "huffman.txt" contains a string like "άSUä5Ñ®qøá"F", it gives file size = 30.
#include <sys/stat.h>
#include <stdio.h>

int main()
{
    int size = 0;
    FILE* original_fileptr = fopen("huffman.txt", "rb");
    if (original_fileptr == NULL) {
        printf("ERROR: fopen fail in %s at %d\n", __FUNCTION__, __LINE__);
        return 1;
    }
    /*create variable of stat*/
    struct stat stp = { 0 };
    stat("huffman.txt", &stp);
    /*determine the size of data which is in file*/
    int filesize = stp.st_size;
    printf("\nFile size is %d\n", filesize);
}
Upvotes: 0
Views: 1425
Reputation: 56
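This has to do with encoding. stat() reports the size in bytes, not the number of characters.
Plain English text is encoded in ASCII, where each character is one byte. Non-ASCII characters take more than one byte each; how many depends on the encoding the file was saved in. In UTF-8, for example, they take two to four bytes per character.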
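The easiest way to see what is happening is to print each byte of the file in hex:
    int c;  /* int, not char, so that EOF can be told apart from data */
    /* Read the file one byte at a time. */
    while ((c = fgetc(original_fileptr)) != EOF)
        printf("%02X ", (unsigned)c);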
You'll understand why the file size is different.
Upvotes: 4
Reputation: 14427
If you're asking why different strings with the same number of characters could have different sizes in bytes, read up on UTF-8.
Upvotes: 0