bgutt3r

Reputation: 568

Find conditional evaluation in for loop using libclang

I'm using clang (via libclang, through the Python bindings) to put together a code-review bot. I've been assuming that every FOR_STMT cursor has 4 children: INIT, EVAL, INC, and BODY.

for( INIT; EVAL; INC )
    BODY;

This would imply that I could check the contents of the evaluation expression with something in Python like:

forLoopComponents = list( forCursor.get_children() )
assert len( forLoopComponents ) == 4

initExpressionCursor = forLoopComponents[ 0 ]
evalExpressionCursor = forLoopComponents[ 1 ]
incExpressionCursor = forLoopComponents[ 2 ]
bodyExpressionCursor = forLoopComponents[ 3 ]

errorIfContainsAssignment( evalExpressionCursor ) # example code style rule

This approach seems less than great to begin with, but I accepted it as a consequence of libclang, and especially the Python bindings, being rather sparse. However, I've recently noticed that a loop like:

for( ; a < 4; a-- )
    ;

will only have 3 children, and the evaluation will now be the first one rather than the second. I had always assumed that libclang would return a NULL_STMT for any unused part of the FOR_STMT. Clearly, I was wrong.

What is the proper approach for parsing the FOR_STMT? I can't find anything useful for this in libclang.

UPDATE: Poking through the libclang source, it looks like these 4 components are dumbly added from the clang::ForStmt class using a visitor object. The ForStmt object should be returning null statement objects, but some layer somewhere seems to be stripping these out of the vector of visited nodes.

Upvotes: 3

Views: 369

Answers (1)

Robert

Reputation: 63

I see the same behavior. As a workaround, I replaced the first empty statement with a dummy int foo=0 statement. I can imagine a solution that uses the Cursor's get_tokens to match the parts of the statement. get_tokens can help in situations where the clang cursors alone are not enough.
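
A minimal sketch of the token idea, not a tested solution against libclang itself. The helper name split_for_header is made up for illustration. It takes the token spellings of a for statement and splits the header on its top-level semicolons, so an omitted part comes back as an empty list instead of silently shifting the others. It assumes a classic three-part for loop; a range-based for has no header semicolons and is rejected.

```python
def split_for_header(spellings):
    """Split the token spellings of a `for` statement into
    (init, condition, increment) token lists.

    An omitted part is returned as an empty list.
    Hypothetical helper: it only looks at token text, not at the AST.
    """
    if not spellings or spellings[0] != "for":
        raise ValueError("expected the tokens of a for statement")

    parts = [[]]
    depth = 0  # nesting depth of (), [] and {} inside the header
    for tok in spellings[1:]:
        if tok in ("(", "[", "{"):
            depth += 1
            if depth == 1:
                continue  # the header's own opening parenthesis
        elif tok in (")", "]", "}"):
            depth -= 1
            if depth == 0:
                break  # the header's own closing parenthesis
        elif tok == ";" and depth == 1:
            parts.append([])  # a top-level ';' starts the next part
            continue
        parts[-1].append(tok)

    if len(parts) != 3:
        # e.g. a range-based for, which has no ';' in its header
        raise ValueError("not a three-part for header")
    return tuple(parts)


# The loop from the question: for( ; a < 4; a-- ) ;
init, cond, inc = split_for_header(
    ["for", "(", ";", "a", "<", "4", ";", "a", "--", ")", ";"]
)
```

With the Python bindings, the input could presumably be built as [t.spelling for t in forCursor.get_tokens()]. Knowing which of the three parts are empty then tells you which slot each cursor from get_children() belongs to, with the last child being the body.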

Upvotes: 1
