Reputation: 21
I have a problem with this C program, which reads the contents of a file into a string and then prints it. Whenever I allocate a string, it always contains 3 strange characters. I could get rid of them by putting '\0' at the beginning, to initialize it to an empty string, as shown in parts 1 and 2. But when reading the file, the 3 characters don't go away even with that technique, as shown in part 3.
Does anyone know why those 3 characters are printed, given that they don't appear if I copy the string into another file? And why do they still appear when I read the file?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define length 20

int main(void) {
    ///////// PART 1
    /* No initialization here: this is where the strange characters show up. */
    char *T = (char*) malloc((length + 1) * sizeof(char));
    printf("%s\n", T);
    strcat(T, "hello");
    printf("%s\n", T);

    ///////// PART 2
    /* Same as part 1, but the buffer starts out as an empty string. */
    char *M = (char*) malloc((length + 1) * sizeof(char));
    M[0] = '\0';
    printf("%s\n", M);
    strcat(M, "hello");
    printf("%s\n", M);

    ///////// PART 3
    FILE *fil = fopen("test.txt", "r");
    if (fil == NULL) {
        perror("test.txt");
        return 1;
    }
    char *S = (char*) malloc((length + 1) * sizeof(char));
    S[0] = '\0';
    /* fread returns the number of bytes actually read; terminate right after them. */
    size_t n = fread(S, sizeof(char), length, fil);
    S[n] = '\0';
    printf("%s\n", S);
    fclose(fil);

    free(T);
    free(M);
    free(S);
    return 0;
}
Upvotes: 0
Views: 79
Reputation: 149075
Could the 3 characters be ï»¿ or ´╗┐? It is common to prepend a Byte Order Mark (BOM) to Unicode text files. The BOM is the code point U+FEFF (0xfeff). In a UTF-8 encoded file it appears as the 3 bytes "\xef\xbb\xbf", in a UTF-16 Little Endian file as the 2 bytes "\xff\xfe", and in a UTF-16 Big Endian file as the 2 bytes "\xfe\xff".
If you are reading a file that contains a BOM, having those special characters is normal.
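If the file does turn out to start with a UTF-8 BOM, one possible way to keep it out of the printed string is to check the first 3 bytes after reading and drop them. The following is only a minimal sketch under that assumption: the helper names utf8_bom_length and read_text_without_bom are made up for illustration, and it handles only the UTF-8 BOM, not the UTF-16 ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 3 if buf (n bytes long) starts with the
   UTF-8 BOM bytes EF BB BF, and 0 otherwise. */
size_t utf8_bom_length(const char *buf, size_t n)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    assert(buf != NULL);
    return (n >= 3 && memcmp(buf, bom, 3) == 0) ? 3 : 0;
}

/* Hypothetical helper: reads at most cap - 1 bytes from path into out,
   removes a leading UTF-8 BOM if there is one, and NUL-terminates out.
   Returns the number of bytes kept, or -1 if the file cannot be opened. */
long read_text_without_bom(const char *path, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);

    FILE *f = fopen(path, "rb");   /* binary mode: read the bytes as stored */
    if (f == NULL)
        return -1;

    size_t n = fread(out, 1, cap - 1, f);
    fclose(f);

    size_t skip = utf8_bom_length(out, n);
    memmove(out, out + skip, n - skip);   /* shift the text over the BOM */
    out[n - skip] = '\0';
    return (long)(n - skip);
}
```

With the buffer from part 3 of the question, a call such as read_text_without_bom("test.txt", S, length + 1) followed by printf("%s\n", S) should then print the text without the 3 leading bytes, provided they really are a UTF-8 BOM.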
Upvotes: 2