Menna Serag

Reputation: 1

I'm just trying to prove something

I'm working on a vulnerability detection tool using Tree-sitter parsers. I have a shared library (.so file) built from multiple Tree-sitter language repositories, and I'm using it to parse code for vulnerabilities. However, the detection results are inconsistent—some known vulnerabilities are missed, while others are falsely flagged.

Here's what I've done so far:

1. Cloned and built the Tree-sitter repositories using a build.sh script that runs build.py.
2. Loaded the .so file in my Python script using ctypes.
3. Parsed code snippets and tried to extract vulnerability-related tokens.

Issues I'm facing:

- Some vulnerability patterns aren't detected, even though they exist in the dataset.
- The output sometimes includes false positives.
- I suspect the parsing process isn’t working as expected, but I’m not sure how to debug it.

Questions:

1. How can I verify if my .so file correctly contains all required Tree-sitter grammars?
2. Is there a way to debug how Tree-sitter is parsing my code to check if the right nodes are being matched?
3. Could this issue be due to how I'm calling the Tree-sitter API from Python?
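
For the first question, one possible starting point is sketched below. It relies on the fact that each generated Tree-sitter grammar exports a C entry point named tree_sitter_<language>, so a combined .so should expose one such symbol per grammar. The function names, the library path, and the language list are hypothetical placeholders, not taken from my actual build.

```python
import ctypes


def missing_grammar_symbols(lib, languages):
    """Return the languages whose tree_sitter_<name> entry point is absent from lib.

    `lib` is expected to be a ctypes.CDLL; looking up a symbol that the
    library does not export raises AttributeError, so hasattr() is enough.
    """
    return [name for name in languages if not hasattr(lib, f"tree_sitter_{name}")]


def check_library(path, languages):
    """Load the shared library at `path` and report missing grammar symbols."""
    lib = ctypes.CDLL(path)
    return missing_grammar_symbols(lib, languages)


# Hypothetical usage (path and language names are assumptions):
#   missing = check_library("./build/my-languages.so", ["python", "c", "java"])
#   print("missing grammars:", missing or "none")
```

This only confirms that the symbols are present. It does not show whether each grammar was built against a compatible Tree-sitter version or whether the expected nodes are being matched.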

Upvotes: -4

Views: 43

Answers (0)
