Parsing Linux source into abstract syntax tree

I'd like to perform source code analysis of the Linux kernel, but to do that I first need to parse it. What are my options? I'd prefer an AST usable from Python, but any other language is OK too.

Apparently CIL is able to parse the whole kernel, but it's not clear from the website how to do that.

Upvotes: 1

Answers (3)

kernelUser

Reputation: 21

You can check the page Parsing Kernel for a comparison of tools. The winner seems to be KDevelop.

Regards,

Upvotes: 2

SK-logic

Reputation: 9725

Do you really need an AST, or would a lower-level intermediate representation be enough? For both options you can use Clang, and analyse either its AST (sadly, with C++ only) or the LLVM IR.
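Since the question asks for something usable from Python, here is a minimal sketch of one way to look at Clang's AST without writing C++: ask the clang driver to dump the AST as JSON (`-Xclang -ast-dump=json`, available in reasonably recent clang releases) and walk the result with the standard library. The helper names `clang_ast_command`, `dump_ast` and `function_names` are made up for this illustration, and the sketch assumes a `clang` binary on the PATH.

```python
import json
import subprocess


def clang_ast_command(source, extra_flags=()):
    """Build a clang command line that dumps one file's AST as JSON."""
    return ["clang", "-fsyntax-only", "-Xclang", "-ast-dump=json",
            *extra_flags, source]


def dump_ast(source, extra_flags=()):
    """Run clang (assumed to be on PATH) and return the AST as nested dicts."""
    result = subprocess.run(clang_ast_command(source, extra_flags),
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def function_names(node):
    """Yield the name of every FunctionDecl in the tree rooted at node."""
    if node.get("kind") == "FunctionDecl" and "name" in node:
        yield node["name"]
    # Child nodes, when present, are stored under the "inner" key.
    for child in node.get("inner", []):
        yield from function_names(child)
```

A kernel source file will generally only parse with the include paths and macro definitions that the kernel build passes to the compiler, so `extra_flags` would have to carry those; how to collect them is outside this sketch.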

CIL is also an option, but you'd need to write your analysis tool in OCaml. cilly is its drop-in replacement for gcc, but it might need some hacking to work with a build sequence as non-trivial as the Linux kernel's. Just using --merge won't be sufficient.

Upvotes: 1

sarnold

Reputation: 104080

I'd recommend starting with the sparse static analysis tool. Because sparse was designed specifically to help the kernel developers perform static analysis on the kernel, you can have some level of assurance that it really ought to parse the combination of C99 and GNU extensions used in the kernel sources. The code I've examined looked clean and straightforward, but I never tried to extend it in any way. The Documentation/sparse.txt file has a very short synopsis of using sparse on the kernel sources, if you want a very high-level overview.
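For a rough idea of what running sparse over the kernel looks like, the kernel build accepts `C=1` (check only the files that get recompiled) and `C=2` (check every file regardless). The sketch below merely drives that from Python; `sparse_make_command` and `run_sparse` are hypothetical helper names, and it assumes `make` and `sparse` are installed and that `kernel_dir` points at a configured kernel tree. It only collects sparse's diagnostics and does not give you an AST in Python.

```python
import subprocess


def sparse_make_command(kernel_dir, all_files=False, jobs=1):
    """Build the make invocation that runs sparse during a kernel build."""
    # C=1 checks only files that are recompiled, C=2 checks all files.
    check_mode = "C=2" if all_files else "C=1"
    return ["make", "-C", kernel_dir, f"-j{jobs}", check_mode]


def run_sparse(kernel_dir, all_files=False, jobs=1):
    """Run the build and return the completed process with captured output."""
    return subprocess.run(sparse_make_command(kernel_dir, all_files, jobs),
                          capture_output=True, text=True)
```

The returned object's `stdout` and `stderr` can then be scanned for the warnings emitted during the build.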

Another option is GCC MELT, a tool designed to make it easier to build plugins for the gcc compiler. Using it would require knowing enough gcc internals to find your way around, but MELT does look far easier than coding a similar plugin directly in C.

Upvotes: 2
