Reputation: 18966
I have C code that reads one line at a time from a file opened in text mode, using
fgets(buf,200,fin);
The input file that fgets() reads lines from is a command-line argument to the program.
Now, fgets leaves the newline character in the string copied to buf.
Somewhere down the line in the code I check
length = strlen(buf);
For some input files, which I guess were edited in a *nix environment, the newline is just '\n'.
But other test input files (which I guess were edited or created under Windows) have two characters marking a newline: '\r' '\n'.
I want to remove the newline character(s) and put a '\0' in their place as the string terminator. So I have to do either
if(length == (N+1))
{
if(buf[length-1] == '\n')
{
buf[length-2] = '\0'; //for a `\r\n` newline
}
}
or
if(length == (N))
{
if(buf[length-1] == '\n')
{
buf[length-1] = '\0'; //for a `\n` newline
}
}
Since the text files are passed as command-line arguments to the program, I have no control over how they were edited or composed, and hence cannot filter them with some tool to make the newlines consistent.
How can I handle this situation?
Is there any fgets-equivalent function in the standard C library (no extensions) that can handle these inconsistent newline characters and return a string without them?
Upvotes: 1
Views: 11941
Reputation: 16512
I think your best (and easiest) option is to write your own strlen function:
size_t zstrlen(char *line)
{
char *s = line;
while (*s && *s != '\r' && *s != '\n') s++;
*s = '\0';
return (s - line);
}
Now, to calculate the length of the string excluding the newline character(s), and to remove them at the same time, you simply do:
fgets(buf,200,fin);
length = zstrlen(buf);
It works for Unix style ('\n'), Windows style ('\r\n') and old Mac style ('\r').
Note that there are faster (but non-portable) implementations of strlen that you can adapt to your needs.
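For comparison, the same truncation can probably be done with the standard strcspn function from <string.h>, which returns the length of the initial segment containing none of the listed characters. This is only a sketch of an alternative, not part of the answer above; the helper name chomp_line is made up for illustration.

```c
#include <string.h>

/* Hypothetical helper (name is illustrative, not from the post):
   cut the line at the first '\r' or '\n' and return the new length. */
size_t chomp_line(char *line)
{
    /* Number of leading characters that are neither CR nor LF. */
    size_t n = strcspn(line, "\r\n");
    line[n] = '\0';
    return n;
}
```

After a successful fgets(buf, 200, fin), calling length = chomp_line(buf); should leave buf without its line ending whether the file used '\n', '\r\n' or '\r'.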
Hope it helps, RD:
Upvotes: 1
Reputation: 10558
If you are troubled by the different line endings (\n and \r\n) on different machines, one way to neutralize them would be to use the dos2unix command (assuming you are working on Linux and have files edited in a Windows environment). That command replaces all Windows-style line endings with Linux-style line endings. The reverse, unix2dos, also exists. You can call these utilities from within the C program (with system, maybe) and then process the lines as you are currently doing. This would reduce the burden on your program.
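A rough sketch of what calling the utility from C might look like is below. The function names are hypothetical, it assumes dos2unix is installed and on the PATH, and it assumes the file name contains no single quotes (real code would need proper shell escaping). Note also that dos2unix rewrites the file in place, which may not be acceptable for input you do not own.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes "dos2unix '<path>'" into cmd.
   Returns 0 on success, -1 if the buffer is too small.
   Assumption: path contains no single quotes. */
int build_dos2unix_cmd(char *cmd, size_t size, const char *path)
{
    int n = snprintf(cmd, size, "dos2unix '%s'", path);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: converts the file in place.
   Returns 0 only if the command could be built and system() returned 0. */
int normalize_newlines(const char *path)
{
    char cmd[512];
    if (build_dos2unix_cmd(cmd, sizeof cmd, path) != 0)
        return -1;
    return system(cmd) == 0 ? 0 : -1;
}
```

The program would call normalize_newlines(argv[1]) before opening the file and fall back to handling '\r' itself if the call fails.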
Upvotes: 0
Reputation: 108988
I like to update length at the same time
/* the length > 0 checks avoid reading buf[-1] on an empty string */
if (length > 0 && buf[length - 1] == '\n') buf[--length] = 0;
if (length > 0 && buf[length - 1] == '\r') buf[--length] = 0;
or, to remove all trailing whitespace
/* remember to #include <ctype.h> */
while ((length > 0) && isspace((unsigned char)buf[length - 1])) {
buf[--length] = 0;
}
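Put together with the read loop, the first variant might look like the sketch below. The function names strip_eol and longest_line are invented for this example, and the 200-byte buffer simply mirrors the size used in the question; lines longer than that would be split across several fgets calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: removes one trailing "\n", "\r\n" or "\r"
   and returns the updated length. */
size_t strip_eol(char *buf)
{
    size_t length = strlen(buf);
    if (length > 0 && buf[length - 1] == '\n') buf[--length] = '\0';
    if (length > 0 && buf[length - 1] == '\r') buf[--length] = '\0';
    return length;
}

/* Hypothetical example of use: reads every line from fin and returns
   the length of the longest one, not counting the line ending. */
size_t longest_line(FILE *fin)
{
    char buf[200];
    size_t longest = 0;
    /* fgets returns NULL at end of file or on error, which ends the loop. */
    while (fgets(buf, sizeof buf, fin) != NULL) {
        size_t length = strip_eol(buf);
        if (length > longest)
            longest = length;
    }
    return longest;
}
```

Checking the return value of fgets before touching buf matters here, because buf is left unchanged or indeterminate when nothing was read.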
Upvotes: 2