mbq

Reputation: 18628

Converting special characters (like \n) to their escaped versions

How to convert for instance "A\r\nB\tC\nD" to "A\\r\\nB\\tC\\nD" in C(++)?

Ideally using standard library only and a bonus upvote for both pure C and pure C++ solutions.

Upvotes: 0

Views: 4818

Answers (5)

Bob Black

Reputation: 2405

Here's an algorithm in C#. Maybe you can treat it like pseudo-code and convert it to C++.

public static string EscapeChars(string Input)
{
    string Output = "";

    foreach (char c in Input)
    {
        switch (c)
        {
            case '\n':
                Output += "\\n";
                break;
            case '\r':
                Output += "\\r";
                break;
            case '\t':
                Output += "\\t";
                break;
            default:
                Output += c;
                break;
        }
    }
    return Output;
}

Upvotes: 0

Daniel Sloof

Reputation: 12706

Here is something I came up with...

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* static: a plain C99 "inline" provides no external definition and can fail to link */
static inline char needs_escaping(char val) {
        switch(val) {
                case '\n': return 'n';
                case '\r': return 'r';
                case '\t': return 't';
        }
        return 0;
}

char *escape_string(const char *in) {
        size_t needed = 0, j = 0, length = strlen(in), i;
        /* first pass: count the characters that need a leading backslash */
        for(i = 0; i < length; i++) {
                if(needs_escaping(in[i])) needed++;
        }

        char *out = malloc(length + needed + 1);
        if(out == NULL) return NULL;
        for(i = 0; i < length; i++) {
                char escape_val = needs_escaping(in[i]);
                if(escape_val) {
                        out[j++] = '\\';
                        out[j++] = escape_val;
                }
                else {
                        out[j++] = in[i];
                }
        }
        out[length + needed] = '\0';    
        return out;
}

int main() {
        char *in  = "A\r\nB\tC\nD";
        char *out = escape_string(in);
        if(out == NULL) return 1;
        printf("%s\n", out);
        free(out);
        return 0;
}

Upvotes: 2

Billy ONeal

Reputation: 106609

Of course, replace char with wchar_t and std::string with std::wstring if you're using wide character strings.

std::string input(/* ... */);
std::string output;
for(std::string::const_iterator it = input.begin(); it != input.end(); ++it)
{
    char currentValue = *it;
    switch (currentValue)
    {
    case '\t':
        output.append("\\t");
        break;
    case '\\':
        output.append("\\\\");
        break;
    //.... etc.
    default:
        output.push_back(currentValue);
    }
}

You can do this in C, but it is harder because you don't know the output size in advance. You can, however, allocate for the worst case: twice the length of the original string, plus one byte for the terminating null. For example:

//Disclaimer; it's been a while since I've written pure C, so this may
//have a bug or two.
const char * input = /* ... */;
size_t inputLen = strlen(input);
char * output = malloc(inputLen * 2 + 1); // +1 for the terminating null
const char * inputPtr = input;
char * outputPtr = output;
do
{
    char currentValue = *inputPtr;
    switch (currentValue)
    {
    case '\t':
        *outputPtr++ = '\\';
        *outputPtr = 't';
        break;
    case '\\':
        *outputPtr++ = '\\';
        *outputPtr = '\\';
        break;
    //.... etc.
    default:
        *outputPtr = currentValue;
    }
} while (++outputPtr, *inputPtr++);

(Remember to add error handling to the C version for things like malloc failures ;) )

Upvotes: 3

NPE

Reputation: 500773

I would create a lookup table of 32 const char* literals, one for every control code (ASCII 0 to ASCII 31). I would then iterate over the original string, copying non-control chars (ASCII >= 32) to the output buffer and substituting values from the lookup table for ASCII 0 to 31.

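Note 1: ASCII 0 is special for C strings, where it marks the end of the string and so can never appear inside one. That is not the case for a C++ std::string, which can contain it.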

Note 2: The lookup table would contain C escape sequences for codes that have them (\n, \r etc) and backslash plus hex/octal/decimal codes for those that don't.
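A minimal sketch of that lookup-table idea in C. The names control_escapes and escape_controls are hypothetical, and the code assumes an ASCII-compatible character set. For codes without a named escape it uses three-digit octal, one of the options mentioned above: an octal escape is at most three digits long, whereas a hex escape would absorb any hex digits that happen to follow it. The sketch does not escape the backslash itself, and ASCII 0 never reaches the table because it ends the C string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry per control code 0..31 (ASCII assumed). Codes with a named
   C escape use it; the others use a three-digit octal escape. */
static const char *const control_escapes[32] = {
    "\\000", "\\001", "\\002", "\\003", "\\004", "\\005", "\\006", "\\a",
    "\\b",   "\\t",   "\\n",   "\\v",   "\\f",   "\\r",   "\\016", "\\017",
    "\\020", "\\021", "\\022", "\\023", "\\024", "\\025", "\\026", "\\027",
    "\\030", "\\031", "\\032", "\\033", "\\034", "\\035", "\\036", "\\037"
};

/* Returns a newly allocated escaped copy of in, or NULL if malloc fails.
   The caller must free() the result. */
char *escape_controls(const char *in) {
    assert(in != NULL);
    size_t len = strlen(in);
    /* Worst case: every byte expands to a 4-character octal escape. */
    char *out = malloc(len * 4 + 1);
    if (out == NULL) return NULL;

    char *dst = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 32) {
            /* Control code: copy its escape sequence from the table. */
            const char *esc = control_escapes[c];
            size_t n = strlen(esc);
            memcpy(dst, esc, n);
            dst += n;
        } else {
            /* Everything else is copied through unchanged. */
            *dst++ = (char)c;
        }
    }
    *dst = '\0';
    return out;
}
```

Called on the question's input "A\r\nB\tC\nD", this should produce the 12 characters A\r\nB\tC\nD with literal backslashes.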

Upvotes: 0

Oliver Charlesworth

Reputation: 272677

I doubt there's any standard library function that does this directly. The most efficient way would be simply to iterate over the input buffer character by character, conditionally copying into an output buffer, with some special state-machine logic to handle '\', etc.

I'm sure there are ways to do this with various combinations of strchr() et al, but it will probably be less efficient in the general case.
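The character-by-character loop described above might look like the following sketch in C. escape_into and escape_letter are hypothetical names. The snprintf-style interface (the caller supplies the buffer, and the return value is the length the full result needs) is an assumption of this sketch, not something the answer specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the letter that follows the backslash for c,
   or 0 if c is copied through unchanged. */
static char escape_letter(char c) {
    switch (c) {
    case '\n': return 'n';
    case '\r': return 'r';
    case '\t': return 't';
    case '\\': return '\\';
    default:   return 0;
    }
}

/* Writes the escaped form of src into dst (at most dst_size bytes,
   always null-terminated when dst_size > 0). Returns the number of
   characters the complete result needs, excluding the terminator. */
size_t escape_into(char *dst, size_t dst_size, const char *src) {
    assert(src != NULL);
    assert(dst != NULL || dst_size == 0);

    size_t needed = 0;
    for (; *src != '\0'; src++) {
        char e = escape_letter(*src);
        if (e != 0) {
            /* Two output characters: the backslash, then the letter. */
            if (needed + 1 < dst_size) dst[needed] = '\\';
            needed++;
            if (needed + 1 < dst_size) dst[needed] = e;
            needed++;
        } else {
            if (needed + 1 < dst_size) dst[needed] = *src;
            needed++;
        }
    }
    if (dst_size > 0) {
        /* Terminate after the last character that actually fit. */
        dst[needed < dst_size ? needed : dst_size - 1] = '\0';
    }
    return needed;
}
```

Calling escape_into(NULL, 0, s) returns the required length, so the caller can allocate that many bytes plus one and call again. If the buffer is too small the output is truncated, possibly in the middle of an escape sequence, so a return value greater than or equal to dst_size should be treated as an error.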

Upvotes: 1
