Reputation: 13245
I'm trying to make a very basic tokenizer/lexer. To do this, I'm making a main struct called `Token` that all types of tokens will inherit from, such as `IntToken` and `PlusToken`.
Every new type of token will include a `type` variable as a string and a `to_string` function, which returns a representation like `Token(PLUS)` or `Token(INT, 5)` (5 would be replaced by whatever integer value it is).
I've looked at many questions on SO, and it looks like I need to make a vector of `std::shared_ptr<BaseClass>` (in my case, `BaseClass` would be `Token`): https://stackoverflow.com/a/20127962/12101554
I first tried doing this the way I thought it should be done. Since that didn't work, I looked on SO and found the answer linked above, but it doesn't seem to work either.
Am I following the answer wrong, did I make some other error, or is this not possible to do in C++ without a lot of other code?
(I have also tried converting all the `struct`s to `class`es and adding `public:`, but that makes no change.)
#include <cctype>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct Token {
    std::string type = "Uninitialized";
    virtual std::string to_string() { return "Not implemented"; };
};

struct IntToken : public Token {
    IntToken(int value) {
        this->value = value;
    }
    std::string type = "INT";
    int value;
    std::string to_string() {
        return "Token(INT, " + std::to_string(value) + ")";
    }
};

struct PlusToken : public Token {
    std::string type = "PLUS";
};

std::vector<std::shared_ptr<Token>> tokenize(std::string input) {
    std::vector<std::shared_ptr<Token>> tokens;
    for (int i = 0; i < input.length(); i++) {
        char c = input[i];
        if (std::isdigit(c)) {
            std::cout << "Digit" << std::endl;
            IntToken t = IntToken(c - 48);
            std::cout << t.value << std::endl;
            tokens.push_back(std::make_shared<IntToken>(t));
        }
        else if (c == '+') {
            std::cout << "Plus" << std::endl;
            PlusToken p = PlusToken();
            tokens.push_back(std::make_shared<PlusToken>(p));
        }
    }
    return tokens;
}

int main()
{
    std::string input = "5+55";
    std::vector<std::shared_ptr<Token>> tokens = tokenize(input);
    for (int i = 0; i < tokens.size(); i++) {
        //std::cout << tokens[i].to_string() << std::endl;
        std::cout << tokens[i]->type << std::endl;
    }
}
Current Output:
Digit
5
Plus
Digit
5
Digit
5
Uninitialized
Uninitialized
Uninitialized
Uninitialized
Expected Output: (with current code)
Digit
5
Plus
Digit
5
Digit
5
Token(INT, 5)
Token(PLUS)
Token(INT, 5)
Token(INT, 5)
Note: Yes, I know that the proper tokenization would be (5) (+) (55), but I'm still creating the basic part.
Upvotes: 1
Views: 115
Reputation: 10756
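You are giving your derived classes their own `type` member variables. These hide the base-class member rather than replacing it, and because data members are never virtual, reading `tokens[i]->type` through a `Token` pointer always yields `Token::type`, which is still "Uninitialized". Instead, you should be setting the `type` that belongs to the base class inside the derived-class constructors.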
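A minimal sketch of that fix could look like the following. The base-class constructor taking the type string, the virtual destructor, the `const`/`override` qualifiers, and the default `to_string` body are my own assumptions for illustration, not something taken from the question:

```cpp
#include <cctype>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class: owns the one and only `type` member shared by all tokens.
struct Token {
    std::string type;

    // Assumed helper constructor: derived classes pass their type name up.
    explicit Token(std::string type) : type(std::move(type)) {}

    // Good practice for a class that is used polymorphically.
    virtual ~Token() = default;

    // Default representation for tokens that carry no value.
    virtual std::string to_string() const { return "Token(" + type + ")"; }
};

struct IntToken : Token {
    int value;

    // Sets the base-class `type` instead of declaring a second member.
    explicit IntToken(int value) : Token("INT"), value(value) {}

    std::string to_string() const override {
        return "Token(" + type + ", " + std::to_string(value) + ")";
    }
};

struct PlusToken : Token {
    PlusToken() : Token("PLUS") {}
};

// Same single-character tokenizing as in the question, minus the debug output.
std::vector<std::shared_ptr<Token>> tokenize(const std::string& input) {
    std::vector<std::shared_ptr<Token>> tokens;
    for (char c : input) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            tokens.push_back(std::make_shared<IntToken>(c - '0'));
        } else if (c == '+') {
            tokens.push_back(std::make_shared<PlusToken>());
        }
    }
    return tokens;
}
```

With the question's input "5+55", `tokens[i]->type` now reads INT, PLUS, INT, INT, and `tokens[i]->to_string()` gives Token(INT, 5), Token(PLUS), Token(INT, 5), Token(INT, 5). Note that elements of the vector are pointers, so the call needs `->` rather than the `.` used in the commented-out line.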
Upvotes: 1