Reputation: 568
I'm using clang (via libclang via the Python bindings) to put together a code-review bot. I've been making the assumption that all FOR_STMT cursors will have 4 children; INIT, EVAL, INC, and BODY..
for( INIT; EVAL; INC )
BODY;
which would imply that I could check the contents of the evaluation expression with something in python like:
forLoopComponents = [ c for c in forCursor.get_children() ]
assert( len( forLoopComponents ) == 4 )
initExpressionCursor = forLoopComponents[ 0 ]
evalExpressionCursor = forLoopComponents[ 1 ]
incExpressionCursor = forLoopComponents[ 2 ]
bodyExpressionCursor = forLoopComponents[ 3 ]
errorIfContainsAssignment( evalExpressionCursor ) # example code style rule
This approach seems...less than great to begin with, but I just accepted it as a result of libclang, and the Python bindings especially, being rather sparse. However I've recently noticed that a loop like:
for( ; a < 4; a-- )
;
will only have 3 children -- and the evaluation will now be the first one rather than the second. I had always assumed that libclang would just return the NULL_STMT for any unused parts of the FOR_STMT...clearly, I was wrong.
What is the proper approach for parsing the FOR_STMT? I can't find anything useful for this in libclang.
UPDATE: Poking through the libclang source, it looks like these 4 components are dumbly added from the clang::ForStmt class using a visitor object. The ForStmt object should be returning null statement objects, but some layer somewhere seems to be stripping these out of the visited nodes vector...?
Upvotes: 3
Views: 369
Reputation: 63
The same here, as a workaround I replaced the first empty statement with a dummy int foo=0 statement. I can imagine a solution, which uses Cursor's get_tokens to match the parts of the statement. The function get_tokens can help in situations, where clang is not enough.
Upvotes: 1