Anand

Reputation: 314

Convert special ASCII Character to XML Compatible string in C++

Is there a C++ API I can use to convert special characters to an XML-compatible string? For example, change

We're sorry, <your> item is out of stock will not be Δ available  (until next month). ÿ

to

We're sorry, &#x03C;your&#x03E; item is out of stock will not be &#x03F; available  &#x028;until next month&#x029;. &#x0FF;

Let me explain a bit more about my problem. I work on a legacy server that writes data reports to flat files. In older versions of our client application, everything works fine with special characters such as <, >, and (.

We are now developing a new client, which accepts an XML string and renders the report on a PHP page. So we changed the system to output an XML file. But when the XML reaches the client and a string contains a special character such as < or ), the DOM parser inside the PHP page starts throwing errors. These characters need to appear in the report, so while the XML file is being created I want to escape them, for example < as &#x03C;

I know of an API function, InternetCanonicalizeUrl, which converts a string to URL encoding. I want something similar for XML.

Upvotes: 0

Views: 2480

Answers (2)

Anand

Reputation: 314

Create an array of XML-encoded strings, one per ASCII character:

// ASCII to XML encoding char map.
// Each index in the array represents an ASCII char; the entry at that index
// is the corresponding XML-encoded string.
// AB 2013/08/02
static const char m_arrAsciiMap[256][8] 
= 
{
    "&#x000;",  "&#x001;",  "&#x002;",  "&#x003;",  "&#x004;",  "&#x005;",  "&#x006;",  "&#x007;",  "&#x008;",  "&#x009;",  "&#x00A;",  "&#x00B;",  "&#x00C;",  "&#x00D;",  "&#x00E;",  "&#x00F;",
    "&#x010;",  "&#x011;",  "&#x012;",  "&#x013;",  "&#x014;",  "&#x015;",  "&#x016;",  "&#x017;",  "&#x018;",  "&#x019;",  "&#x01A;",  "&#x01B;",  "&#x01C;",  "&#x01D;",  "&#x01E;",  "&#x01F;",
    "&#x020;",  "&#x021;",  "&#x022;",  "&#x023;",  "&#x024;",  "&#x025;",  "&#x026;",  "&#x027;",  "&#x028;",  "&#x029;",  "&#x02A;",  "&#x02B;",  "&#x02C;",  "&#x02D;",  "&#x02E;",  "&#x02F;",
    "&#x030;",  "&#x031;",  "&#x032;",  "&#x033;",  "&#x034;",  "&#x035;",  "&#x036;",  "&#x037;",  "&#x038;",  "&#x039;",  "&#x03A;",  "&#x03B;",  "&#x03C;",  "&#x03D;",  "&#x03E;",  "&#x03F;",
    "&#x040;",  "&#x041;",  "&#x042;",  "&#x043;",  "&#x044;",  "&#x045;",  "&#x046;",  "&#x047;",  "&#x048;",  "&#x049;",  "&#x04A;",  "&#x04B;",  "&#x04C;",  "&#x04D;",  "&#x04E;",  "&#x04F;",  
    "&#x050;",  "&#x051;",  "&#x052;",  "&#x053;",  "&#x054;",  "&#x055;",  "&#x056;",  "&#x057;",  "&#x058;",  "&#x059;",  "&#x05A;",  "&#x05B;",  "&#x05C;",  "&#x05D;",  "&#x05E;",  "&#x05F;",  
    "&#x060;",  "&#x061;",  "&#x062;",  "&#x063;",  "&#x064;",  "&#x065;",  "&#x066;",  "&#x067;",  "&#x068;",  "&#x069;",  "&#x06A;",  "&#x06B;",  "&#x06C;",  "&#x06D;",  "&#x06E;",  "&#x06F;",
    "&#x070;",  "&#x071;",  "&#x072;",  "&#x073;",  "&#x074;",  "&#x075;",  "&#x076;",  "&#x077;",  "&#x078;",  "&#x079;",  "&#x07A;",  "&#x07B;",  "&#x07C;",  "&#x07D;",  "&#x07E;",  "&#x07F;",  
    "&#x080;",  "&#x081;",  "&#x082;",  "&#x083;",  "&#x084;",  "&#x085;",  "&#x086;",  "&#x087;",  "&#x088;",  "&#x089;",  "&#x08A;",  "&#x08B;",  "&#x08C;",  "&#x08D;",  "&#x08E;",  "&#x08F;",  
    "&#x090;",  "&#x091;",  "&#x092;",  "&#x093;",  "&#x094;",  "&#x095;",  "&#x096;",  "&#x097;",  "&#x098;",  "&#x099;",  "&#x09A;",  "&#x09B;",  "&#x09C;",  "&#x09D;",  "&#x09E;",  "&#x09F;",  
    "&#x0A0;",  "&#x0A1;",  "&#x0A2;",  "&#x0A3;",  "&#x0A4;",  "&#x0A5;",  "&#x0A6;",  "&#x0A7;",  "&#x0A8;",  "&#x0A9;",  "&#x0AA;",  "&#x0AB;",  "&#x0AC;",  "&#x0AD;",  "&#x0AE;",  "&#x0AF;",
    "&#x0B0;",  "&#x0B1;",  "&#x0B2;",  "&#x0B3;",  "&#x0B4;",  "&#x0B5;",  "&#x0B6;",  "&#x0B7;",  "&#x0B8;",  "&#x0B9;",  "&#x0BA;",  "&#x0BB;",  "&#x0BC;",  "&#x0BD;",  "&#x0BE;",  "&#x0BF;",
    "&#x0C0;",  "&#x0C1;",  "&#x0C2;",  "&#x0C3;",  "&#x0C4;",  "&#x0C5;",  "&#x0C6;",  "&#x0C7;",  "&#x0C8;",  "&#x0C9;",  "&#x0CA;",  "&#x0CB;",  "&#x0CC;",  "&#x0CD;",  "&#x0CE;",  "&#x0CF;",  
    "&#x0D0;",  "&#x0D1;",  "&#x0D2;",  "&#x0D3;",  "&#x0D4;",  "&#x0D5;",  "&#x0D6;",  "&#x0D7;",  "&#x0D8;",  "&#x0D9;",  "&#x0DA;",  "&#x0DB;",  "&#x0DC;",  "&#x0DD;",  "&#x0DE;",  "&#x0DF;",  
    "&#x0E0;",  "&#x0E1;",  "&#x0E2;",  "&#x0E3;",  "&#x0E4;",  "&#x0E5;",  "&#x0E6;",  "&#x0E7;",  "&#x0E8;",  "&#x0E9;",  "&#x0EA;",  "&#x0EB;",  "&#x0EC;",  "&#x0ED;",  "&#x0EE;",  "&#x0EF;",
    "&#x0F0;",  "&#x0F1;",  "&#x0F2;",  "&#x0F3;",  "&#x0F4;",  "&#x0F5;",  "&#x0F6;",  "&#x0F7;",  "&#x0F8;",  "&#x0F9;",  "&#x0FA;",  "&#x0FB;",  "&#x0FC;",  "&#x0FD;",  "&#x0FE;",  "&#x0FF;",
};

// Converts every char that is not on the allow-list below to its XML-encoded
// string. Needs <string.h> for strlen/memcpy; BYTE is the Windows typedef for
// unsigned char.
// pDestBuffer must hold up to 7 * strlen(SourceBuffer) + 1 bytes, because
// each source char can expand to a 7-char reference.

void XMLEncodeString(char *pDestBuffer, const char *SourceBuffer)
{
    size_t buffLen = strlen(SourceBuffer);
    size_t CurrentPointerPos = 0;
    for (size_t i = 0; i < buffLen; i++)
    {
        BYTE ch = (BYTE)SourceBuffer[i];
        if ((ch >= 32 && ch <= 37)
         || (ch == 39)
         || (ch >= 42 && ch <= 59)
         || (ch >= 64 && ch <= 122))
        {
            // Allowed char: copy it unchanged. Digits, upper- and lower-case
            // letters, and certain special chars do not need encoding.
            pDestBuffer[CurrentPointerPos] = SourceBuffer[i];
            CurrentPointerPos++;
        }
        else
        {
            // Char not on the allow-list: replace the single char with its
            // XML-encoded string, e.g. < with &#x03C;
            size_t encodedLen = strlen(m_arrAsciiMap[ch]);
            memcpy(pDestBuffer + CurrentPointerPos, m_arrAsciiMap[ch], encodedLen);
            CurrentPointerPos += encodedLen;
        }
    }
    // Terminate the output so callers can treat it as a C string.
    pDestBuffer[CurrentPointerPos] = '\0';
}
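
If sizing the destination buffer by hand is a concern, the same allow-list could be applied with std::string instead. This is only a sketch: the name XmlEncodeToString is made up for illustration, and it builds each reference with snprintf rather than reading it from the table. Like the table, it still emits references for control characters below 32, which XML 1.0 parsers reject apart from tab, line feed, and carriage return, so it assumes the reports do not contain them.

```cpp
#include <cstdio>
#include <string>

// Hypothetical std::string variant of XMLEncodeString (name is an assumption).
// Uses the same allow-list; everything else becomes a reference like &#x03C;
std::string XmlEncodeToString(const std::string& source)
{
    std::string out;
    out.reserve(source.size());
    for (unsigned char c : source) {
        const bool keep = (c >= 32 && c <= 37) || c == 39
                       || (c >= 42 && c <= 59) || (c >= 64 && c <= 122);
        if (keep) {
            out += static_cast<char>(c);
        } else {
            char ref[8];  // "&#x0FF;" is 7 characters plus the terminator
            std::snprintf(ref, sizeof ref, "&#x%03X;", static_cast<unsigned>(c));
            out += ref;
        }
    }
    return out;
}
```

The string grows as needed, so the caller does not have to reserve seven bytes per input character up front.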

Upvotes: 1

UberJoker

Reputation: 1

Could you please clarify your question?

I'm not sure why you want to use any sort of API. An API is an interface you build to extract data from a system. In any case, for processing a string like that, you could use a switch statement.

Could go like:

switch ( <variable> ) {
case this-value:
  Code to execute if <variable> == this-value
  break;
case that-value:
  Code to execute if <variable> == that-value
  break;
...
default:
  Code to execute if <variable> does not equal the value following any of the cases
  break;
}
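
Filled in for this question, the switch might look like the sketch below. It is an illustration, not a tested drop-in: the function name EscapeXmlText is an assumption, and it only handles the five characters XML reserves, using the predefined named entities. Bytes outside ASCII are copied through unchanged, which assumes the encoding declared in the XML file matches the encoding of the report data.

```cpp
#include <string>

// Hypothetical helper (name is an assumption): escapes the five characters
// that XML reserves and copies every other byte through unchanged.
std::string EscapeXmlText(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}
```

Characters such as ( and ) are left alone here because they have no special meaning in XML text.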

Upvotes: 0
