Reputation: 59
I'm trying to write a function that converts all escape sequences in a string to the characters they stand for. Basically, if I have the string "This \n makes a new line" (a literal backslash followed by n), I would like the \n to be replaced by an actual newline character. So far I've got this. I'm calling it from main:
int main()
{
    unescape("This \\n\\n is \\t\\t\\t string number \\t 7.");
    return 0;
}

char* unescape(char* s)
{
    char *esc[2] = {"\\n", "\\t"};
    int i;
    char* uus = (char*)calloc(80, sizeof(char));
    char* uus2 = (char*)calloc(80,sizeof(char));
    strncpy(uus, s, strlen(s));
    for(i = 0; i < 2; i++)
    {
        while(strstr(uus, esc[i]) != NULL) //checks if \\n can be found
        {
            //printf("\n\n%p\n\n", strstr(uus, esc[i]));
            int c = strstr(uus, esc[i]) - uus; //gets the difference between the address of the beginning of the string and the location
                                               //where the searchable string was found
            uus2 = strncpy(uus2, uus, c); //copies the beginning of the string to a new string
            //add check which esc is being used
            strcat(uus2, "\n"); //adds the non-printable form of the escape sequence
            printf("%s", uus2);
            //should clear the string uus before writing uus2 to it
            strncpy(uus, uus2, strlen(uus2)); //copies the string uus2 to uus so it can be checked again
        }
    }
    //this should return something in the end.
}
Basically, what I need to do now is take the part of the string uus after "\n" and append it to uus2 so I can run the while loop again. I thought about using strtok but hit a wall, as it splits the string on a delimiter that is not always there in my case.
edit: Adding the rest of the string to uus2 should be before strncpy. This is the code without it.
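For reference, the missing step (copy the prefix, write the replacement character, then append whatever follows the escape sequence) could be sketched roughly like this. The helper name replace_first and its parameters are made up for illustration and are not part of the original code; it assumes that esc is a non-empty string and that dst has room for at least strlen(src) + 1 bytes.

```c
#include <string.h>

/* Hypothetical helper (not from the original post): copies src to dst,
   replacing the FIRST occurrence of the escape text `esc` with the single
   character `repl`. Returns 1 if a replacement was made, 0 otherwise.
   Assumes esc is non-empty and dst holds at least strlen(src) + 1 bytes. */
int replace_first(char *dst, const char *src, const char *esc, char repl)
{
    const char *found = strstr(src, esc);
    size_t prefix;

    if (found == NULL)
    {
        strcpy(dst, src);               /* nothing to replace, plain copy */
        return 0;
    }

    prefix = (size_t)(found - src);     /* number of chars before the escape */
    memcpy(dst, src, prefix);           /* 1. copy the beginning */
    dst[prefix] = repl;                 /* 2. write the replacement character */
    strcpy(dst + prefix + 1,            /* 3. append the rest of the string, */
           found + strlen(esc));        /*    skipping the escape text itself */
    return 1;
}
```

Calling it in a loop (swapping source and destination buffers each time) until it returns 0 would replace every occurrence, although that rescans the string from the start on each pass.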
edit vol2: This is the code that works and that I ended up using. I basically edited Ruud's version a bit, as my function had to return a string. Thanks a lot.
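#include <stdlib.h>
#include <string.h>

char* unescape(const char* s)
{
    // The result is never longer than the input, so strlen(s) + 1 bytes
    // are enough (a fixed 80-byte buffer would overflow on longer strings).
    char *uus = (char*) calloc(strlen(s) + 1, sizeof(char));
    int i = 0;
    if (uus == NULL)
        return NULL;
    while (*s != '\0')
    {
        char c = *s++;
        if (c == '\\' && *s != '\0')
        {
            c = *s++;
            switch (c)
            {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            }
        }
        uus[i] = c;
        i++;
    }
    uus[i] = '\0';
    return uus; // the caller is responsible for calling free()
}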
Upvotes: 0
Views: 132
Reputation: 11018
I agree with Anonymouse. It is both clumsy and inefficient to first replace all \n, then all \t. Instead, make a single pass through the string, replacing all escape sequences as you go.
I left the space allocation out in the code sample below; IMHO this is a separate responsibility, not a part of the algorithm, and as such does not belong in the same function.
void unescape(char *target, const char *source)
{
    while (*source != '\0')
    {
        char c = *source++;
        if (c == '\\' && *source != '\0')
        {
            c = *source++;
            switch (c)
            {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            }
        }
        *target++ = c;
    }
    *target = '\0';
}
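To illustrate keeping allocation separate, here is a minimal sketch of a caller-side wrapper. The name unescape_alloc is invented for this example and is not part of the code above; the sketch relies on the fact that unescaping never makes the string longer, so strlen(source) + 1 bytes always suffice. The single-pass function is repeated only so the snippet compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/* Single-pass unescape, same logic as the function shown above. */
void unescape(char *target, const char *source)
{
    while (*source != '\0')
    {
        char c = *source++;
        if (c == '\\' && *source != '\0')
        {
            c = *source++;
            switch (c)
            {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            }
        }
        *target++ = c;
    }
    *target = '\0';
}

/* Hypothetical wrapper (an assumption, not part of the original answer):
   allocates a buffer big enough for the result and fills it.
   Returns NULL if malloc fails; the caller must free() the result. */
char *unescape_alloc(const char *source)
{
    char *result = malloc(strlen(source) + 1);
    if (result != NULL)
        unescape(result, source);
    return result;
}
```

A caller that already owns a suitable buffer can keep calling unescape directly; only code that needs a freshly allocated string goes through the wrapper. Note that with this logic an unrecognised sequence such as \q loses its backslash, and a lone backslash at the very end of the string is copied unchanged.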
EDIT:
Here's an alternative version, using strchr as suggested by Anonymouse.
This implementation should be faster, especially on very long strings with relatively few escape characters.
I posted it primarily as a demonstration of how optimizations can make your code more complex and less readable, and consequently less maintainable and more error-prone. For a detailed discussion, see: http://c2.com/cgi/wiki?OptimizeLater
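void unescape(char *target, const char *source)
{
    while (*source != '\0')
    {
        if (*source++ == '\\' && *source != '\0')
        {
            // Escape sequence: translate the character after the backslash.
            char c = *source++;
            switch (c)
            {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            }
            *target++ = c;
        }
        else
        {
            // Ordinary text: search for the next backslash starting after the
            // current character, then step source back onto that character
            // and copy everything up to the backslash in one go.
            const char *escape = strchr(source--, '\\');
            int numberOfChars = escape != NULL ? escape - source : strlen(source);
            strncpy(target, source, numberOfChars);
            target += numberOfChars;
            source += numberOfChars;
        }
    }
    *target = '\0';
}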
Upvotes: 2
Reputation: 945
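You'd be better off using something like this...
char *p;
p = input_string;
while ((p = strchr (p, '\\')) != NULL)
{
    // p points at the backslash, so the escape character is p [1]
    switch (p [1])
    {
    case 'n' :
        // handle \n
        break;
    case 't' :
        // handle tab
        break;
    }
    // step past the backslash and the character after it (if any);
    // otherwise strchr would keep finding the same backslash
    p += (p [1] != '\0') ? 2 : 1;
}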
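Filled in, the skeleton above might look roughly like the following. This is only a sketch under a few assumptions of mine: the function name unescape_in_place is made up, the string is modified in place (so it must be a writable array, not a string literal), and unknown sequences such as \q are left untouched rather than stripped.

```c
#include <string.h>

/* Hypothetical in-place variant (illustrative only): rewrites \n, \t and \\
   inside the writable buffer s. Unknown escapes and a trailing lone
   backslash are left as they are. */
void unescape_in_place(char *s)
{
    char *p = s;
    while ((p = strchr(p, '\\')) != NULL)
    {
        char replacement;
        switch (p[1])
        {
        case 'n':  replacement = '\n'; break;
        case 't':  replacement = '\t'; break;
        case '\\': replacement = '\\'; break;
        default:
            /* Unknown escape or end of string: skip just the backslash. */
            p++;
            continue;
        }
        *p = replacement;
        /* Close the one-character gap left by the escape character;
           the + 1 makes memmove carry the terminating '\0' along. */
        memmove(p + 1, p + 2, strlen(p + 2) + 1);
        p++;    /* continue searching after the replaced character */
    }
}
```

Each replacement shifts the remainder of the string, so on inputs with many escape sequences this does more copying than a single pass into a separate target buffer.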
Upvotes: 1