Reputation: 12331
I'm trying to implement a Python parser using PLY for the Kconfig language, which is used to generate the configuration options for the Linux kernel.
There's a keyword called source which performs an inclusion, so what I do is that when the lexer encounters this keyword, I change the lexer state to create a new lexer which will lex the sourced file:
def t_begin_source(t):
    r'source '
    t.lexer.begin('source')

def t_source_path(t):
    r'[^\n]+\n+'
    t.lexer.begin('INITIAL')
    global path
    # Build a fresh lexer and run it over the sourced file.
    source_lexer = lex.lex(errorlog=lex.NullLogger())
    source_file_name = path + t.value.strip(' "\n')
    sourced_file = open(source_file_name).read()
    source_lexer.input(sourced_file)
    while True:
        tok = source_lexer.token()
        if not tok:
            break
Somewhere else I have this line:

lexer = lex.lex(errorlog=lex.NullLogger())

This is the "main" or "root" lexer, which is the one the parser calls.
My problem is that I don't know how to tell the parser to use a different lexer, or how to make the source_lexer return something...
Maybe the clone function should be used...
Thanks
Upvotes: 3
Views: 1775
Reputation: 15852
By an interesting coincidence, a link from the same Google search that led me to this question explains how to write your own lexer for a PLY parser. The post explains it simply and well: it's a matter of four instance variables and a single token() method.
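For the record, the requirement really is that small: the parser only ever calls token() on whatever lexer it is given, and each returned token must carry type, value, lineno and lexpos attributes. A minimal sketch (the class names here are illustrative):

# A PLY-compatible lexer only has to expose a token() method; each token
# it returns needs type, value, lineno and lexpos attributes.
class Token(object):
    def __init__(self, type, value, lineno, lexpos):
        self.type = type        # grammar symbol name, e.g. 'SOURCE'
        self.value = value      # matched text
        self.lineno = lineno    # line number, used in error messages
        self.lexpos = lexpos    # character offset into the input

    def __repr__(self):
        return "Token(%s, %r)" % (self.type, self.value)

class ListLexer(object):
    """Feeds a pre-built list of tokens to the parser, one per token() call."""
    def __init__(self, tokens):
        self._iter = iter(tokens)

    def token(self):
        try:
            return next(self._iter)
        except StopIteration:
            return None         # None tells the parser the input is exhausted

An instance can then be handed to the parser with yacc.parse(lexer=ListLexer(tokens)).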
Upvotes: 2
Reputation: 12331
OK, so what I've done is build a list of all the tokens, which is created before the actual parsing.
The parser no longer calls the lexer, because you can override the get-token function used by the parser with the tokenfunc parameter when calling the parse function:

result = yacc.parse(kconfig, debug=1, tokenfunc=my_function)

My function, which is now the one called to get the next token, iterates over the previously built token list.
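For illustration, assuming token_list is the list built beforehand, that function can be a thin wrapper around an iterator (a sketch; only the name my_function comes from the call above):

# yacc calls this instead of lexer.token(); returning None signals
# that the pre-built token list is exhausted.
token_iter = iter(token_list)

def my_function():
    try:
        return next(token_iter)
    except StopIteration:
        return None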
As for the lexing, when I encounter a source keyword, I clone my lexer and switch its input to the contents of the sourced file:
def sourcing_file(source_file_name):
    print "SOURCE FILE NAME", source_file_name
    sourced_file = open(source_file_name).read()
    source_lexer = lexer.clone()       # inherits the rules of the root lexer
    source_lexer.input(sourced_file)
    while True:
        tok = source_lexer.token()
        if not tok:
            break
        token_list.append(tok)         # extend the global token list
    print 'END OF SOURCING FILE'
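Wired into the lexer rule from the question, the call could look like this (a sketch; it assumes the same global path and lexer states as above):

def t_source_path(t):
    r'[^\n]+\n+'
    t.lexer.begin('INITIAL')
    # Lex the whole sourced file into token_list before continuing.
    sourcing_file(path + t.value.strip(' "\n'))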
Upvotes: 0
Reputation: 375484
I don't know about the details of PLY, but in other systems like this that I've built, it made the most sense to have a single lexer which managed the stack of include files. So the lexer would return a unified stream of tokens, opening and closing include files as they were encountered.
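In PLY terms, a sketch of that idea could be a wrapper that holds a stack of lexer clones and pops one whenever the current file runs out of tokens (the class and method names are illustrative; clone(), input() and token() are the actual PLY lexer methods):

class IncludeLexer(object):
    """Presents a stack of include files as a single token stream."""
    def __init__(self, root_lexer, root_text):
        root_lexer.input(root_text)
        self.stack = [root_lexer]

    def push_file(self, filename):
        # Call this from the 'source' rule to descend into an included file.
        inner = self.stack[-1].clone()
        inner.input(open(filename).read())
        self.stack.append(inner)

    def token(self):
        # Serve tokens from the innermost file; on EOF, resume the outer one.
        while self.stack:
            tok = self.stack[-1].token()
            if tok is not None:
                return tok
            self.stack.pop()
        return None

The parser then sees one uninterrupted token stream, and arbitrarily nested includes fall out of the stack for free; an instance can be passed to the parser as yacc.parse(lexer=IncludeLexer(root_lexer, text)).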
Upvotes: 2