Sebas Silva

Reputation: 115

How can I differentiate between a specific keyword and any other word with lex in PLY (Python)?

I'm writing a program that needs to recognize whether a word is a particular instruction or an ID for an instruction to handle. This is what the program currently prints:

LexToken(ID,'Sets',1,0)
LexToken(SEMICOLON,';',1,4)

The problem is that Sets should be CMDSETS, not ID. How can I tell whether a word is an instruction or a regular ID?

The code:

import ply.lex as lex
import ply.yacc as yacc


tokens = [
    'CMDSETS',
    'CMDUNION',
    'ID',
    'COLON',
    'SEMICOLON',

    ]
t_CMDSETS=r'Sets'
t_CMDUNION=r'Union'
t_COLON= r','
t_SEMICOLON=r';'


def t_ID(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    t.type='ID'
    return t

t_ignore=r' '

def t_error(t):
    print("This thing failed")
    t.lexer.skip(1)

lexer=lex.lex()

lexer.input("Sets;")

while True:
    tok=lexer.token()
    if not tok:
        break
    print(tok)

Upvotes: 1

Views: 1155

Answers (1)

Davis Herring

Reputation: 39818

The PLY documentation explains this exact case. The superficial answer is that PLY tries rules defined as functions before rules defined as string variables, so t_ID matches "Sets" before t_CMDSETS is ever tried. But keyword rules like these wouldn't work anyway: they'd also match the start of words like "Setser" and "Unionize". So just check for keywords in t_ID and reset t.type when the matched text is one of them.
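The keyword check can be sketched roughly as follows, along the lines of the reserved-word pattern in the PLY documentation. This is a minimal sketch based on the lexer from the question, not a definitive implementation: the `reserved` dictionary and the `tokenize` helper are names chosen here for illustration, and the separate `t_CMDSETS` / `t_CMDUNION` string rules are assumed to be dropped in favour of the lookup.

```python
import ply.lex as lex

# Hypothetical keyword table: exact source spelling -> token type.
reserved = {
    'Sets': 'CMDSETS',
    'Union': 'CMDUNION',
}

# Keyword token types still have to be declared in `tokens`.
tokens = ['ID', 'COLON', 'SEMICOLON'] + list(reserved.values())

t_COLON = r','
t_SEMICOLON = r';'
t_ignore = ' '

def t_ID(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    # The whole word is matched first; only an exact keyword is retagged.
    t.type = reserved.get(t.value, 'ID')
    return t

def t_error(t):
    print("Illegal character %r" % t.value[0])
    t.lexer.skip(1)

lexer = lex.lex()

def tokenize(text):
    # Illustrative helper: collect (type, value) pairs for one input string.
    lexer.input(text)
    return [(tok.type, tok.value) for tok in iter(lexer.token, None)]

print(tokenize("Sets;"))
print(tokenize("Setser, Union;"))
```

Because the identifier rule consumes the whole word before the lookup happens, "Sets" comes out as CMDSETS while "Setser" stays an ordinary ID.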

Upvotes: 3
