lashgar

Reputation: 5440

Unreserved characters in C/C++

I need to encode every occurrence of the < character in a C/C++ source file. To avoid conflicts, I need to know which characters are not reserved by the C/C++ standards. For example, if $ is not reserved, I can temporarily encode < as $ and restore the original C/C++ code later.

I need this encoding so that I can embed my C/C++ code in an XML-like intermediate language.

Thanks in advance.

Upvotes: 0

Views: 314

Answers (3)

James Kanze

Reputation: 153929

It depends on what you mean by "reserved". An implementation is only required to understand a very limited number of characters in input, with all others being input by means of universal character names. An implementation is allowed (and I would even say encouraged) to support more, see §2.2, point 1. In practice, there are (or should be) no reserved characters in comments, and in string and character literals (at least the wide character forms, and in C++11, the Unicode forms). Your best bet is probably something like quoted printable.
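As a rough illustration of the quoted-printable idea, here is a minimal sketch that encodes only < (as =3C) and the escape character = itself (as =3D). The function names are made up for this example, and it is not a full quoted-printable implementation (no line-length handling, no other characters encoded).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode '<' as "=3C" and the escape character '='
// as "=3D", so every '=' in the output starts an escape sequence.
std::string qp_encode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '<')      out += "=3C";
        else if (c == '=') out += "=3D";
        else               out += c;
    }
    return out;
}

// Hypothetical helper: reverse qp_encode. Any '=' that does not start
// "=3C" or "=3D" is copied through unchanged.
std::string qp_decode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size();) {
        if (in.compare(i, 3, "=3C") == 0)      { out += '<'; i += 3; }
        else if (in.compare(i, 3, "=3D") == 0) { out += '='; i += 3; }
        else                                   { out += in[i]; ++i; }
    }
    return out;
}

// Usage sketch: the original text survives a round trip.
inline void qp_demo() {
    const std::string code = "if (a <= b) x = y << 1;";
    assert(qp_decode(qp_encode(code)) == code);
}
```

Because the escape character is itself encoded, a literal =3C in the original source is not mistaken for an encoded <.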

Upvotes: 1

Pubby

Reputation: 53047

Rather than list the unreserved characters (there are infinitely many), here are the reserved ones, from 2.3.1 of the standard:

space, horizontal tab, vertical tab, form feed, new line
a through z
A through Z
0 through 9
_ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
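To check a candidate placeholder against this list, a small sketch like the following may help. It assumes C++17 (for std::string_view), the names are invented for the example, and the table is the 96-character basic source character set as it stood before C++26, which also contains < and >.

```cpp
#include <string_view>

// The basic source character set (pre-C++26), written out as one string.
constexpr std::string_view kBasicSourceChars =
    " \t\v\f\n"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";

// Hypothetical helper: true if c belongs to the set above.
constexpr bool in_basic_source_set(char c) {
    return kBasicSourceChars.find(c) != std::string_view::npos;
}

static_assert(kBasicSourceChars.size() == 96, "the set has 96 members");
static_assert(in_basic_source_set('<'), "< is part of the set");
static_assert(!in_basic_source_set('$'), "$ is outside the set");
```

Of the printable ASCII characters, only $, @ and the backtick fall outside this set. That does not make them safe placeholders on its own: they can still appear in comments and in string or character literals, and some compilers accept $ in identifiers as an extension.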

Upvotes: 5

simonc

Reputation: 42175

If you convert all < characters to $, how will you preserve any instances of $ in your original file?

Since you say you're targeting an XML-like intermediate language, why not use XML escaping and convert < to &lt; instead? (You'll also need to convert & in that case, to &amp;.) There are lots of open source libraries available to help you do this. If you can't find any stand-alone module, here's code I've written which could have its XML (un)escaping functionality extracted.
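If a library feels like overkill, the escaping itself is small. The following is a minimal sketch, not the code referred to above; the function names are hypothetical, and it handles only & and <, which is all the question needs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replace & and < with their XML entities.
// Escaping & as well keeps any "&lt;" already in the source distinguishable.
std::string escape_xml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            default:  out += c;       break;
        }
    }
    return out;
}

// Hypothetical helper: reverse escape_xml. Only &amp; and &lt; are
// recognised; any other & is copied through unchanged.
std::string unescape_xml(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size();) {
        if (in.compare(i, 5, "&amp;") == 0)     { out += '&'; i += 5; }
        else if (in.compare(i, 4, "&lt;") == 0) { out += '<'; i += 4; }
        else                                    { out += in[i]; ++i; }
    }
    return out;
}

// Usage sketch: the original code survives a round trip.
inline void xml_demo() {
    const std::string code = "if (a < b && c) puts(\"&lt;\");";
    assert(unescape_xml(escape_xml(code)) == code);
}
```

This also answers the $ problem: since the escape character & is itself escaped, nothing in the original file can collide with the encoding.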

Upvotes: 4
