Reputation: 19694
I am trying to come up with a generic solution for parsing strings (with a given format). For instance, I would like to be able to parse a string containing a list of numeric values (integers or floats) and return a std::vector. This is what I have so far:
template<typename T, typename U>
T parse_value(const U& u) {
throw std::runtime_error("no parser available");
}
template<typename T>
std::vector<T> parse_value(const std::string& s) {
std::vector<std::string> parts;
boost::split(parts, s, boost::is_any_of(","));
std::vector<T> res;
std::transform(parts.begin(), parts.end(), std::back_inserter(res),
[](const std::string& s) { return boost::lexical_cast<T>(s); });
return res;
}
Additionally, I would like to be able to parse strings containing other type of values. For instance:
struct Foo { /* ... */ };
template<>
Foo parse_value(const std::string& s) {
/* parse string and return a Foo object */
}
The reason to maintain a single "hierarchy" of parse_value
functions is because, sometimes, I want to parse an optional value (which may exist or not), using boost::optional. Ideally, I would like to have just a single parse_optional_value
function that would delegate on the corresponding parse_value
function:
template<typename T>
boost::optional<T> parse_optional_value(const boost::optional<std::string>& s) {
if (!s) return boost::optional<T>();
return boost::optional<T>(parse_value<T>(*s));
}
So far, my current solution does not work (the compiler cannot deduce the exact function to use). I guess the problem is that my solution relies on deducing the template value based on the return type of parse_value
functions. I am not really sure how to fix this (or even whether it is possible to fix it, since the design approach could just be totally flawed). Does anyone know a way to solve what I am trying to do? I would really appreciate if you could just point me to a possible way to address the issues that I am having with my current implementation. BTW, I am definitely open to completely different ideas for solving this problem too.
Upvotes: 0
Views: 1297
Reputation: 1917
Here is an example of libsass parser:
const char* interpolant(const char* src) {
return recursive_scopes< exactly<hash_lbrace>, exactly<rbrace> >(src);
}
// Match a single character literal.
// Regex equivalent: /(?:x)/
template <char chr>
const char* exactly(const char* src) {
return *src == chr ? src + 1 : 0;
}
where rules could be passed into the lex method.
Upvotes: 0
Reputation: 241771
You cannot overload functions based on return value [1]. This is precisely why the standard IO library uses the construct:
std::cin >> a >> b;
which may not be your piece of cake -- many people don't like it, and it is truly not without its problems -- but it does a nice job of providing a target type to the parser. It also has the advantage over a static parse<X>(const std::string&)
prototype that it allows for chaining and streaming, as above. Sometimes that's not needed, but in many parsing contexts it is essential, and the use of operator>>
is actually a pretty cool syntax. [2]
The standard library doesn't do what would be far and away the coolest thing, which is to skip string constants scanf
style and allow interleaved reading.
vector<int> integers;
std::cin >> "[" >> interleave(integers, ",") >> "]";
However, that could be defined. (Possibly it would be better to use an explicit wrapper around the string literals, but actually I prefer it like that; but if you were passing a variable you'd want to use a wrapper).
[1] With the new auto
declaration, the reason for this becomes even clearer.
[2] IO manipulators, on the other hand, are a cruel joke. And error handling is pathetic. But you can't have everything.
Upvotes: 1