LunaticJape

Reputation: 1584

Rewrite a C++ variable declaration through the AST

I'm currently trying to rewrite C++ code from

Query q(
      "SELECT %LC FROM %T WHERE %W",
      getColumnArray(),
      "table_name",
      buildConstraints(id, "abc"));

to

auto q = QueryBuilder.from("table_name").select(getColumnArray()).where(id,"abc").build();

I'm able to grab the target code block using

varDecl(isExpansionInMainFile(),hasType(cxxRecordDecl(hasName("Query"))))

However, I think there are additional improvements/requirements I may need help to understand:

  1. Would it be possible to match a Query where the argument contains "SELECT"?
  2. Once I have the target code block, how do I extract the arguments of the variable constructor (e.g., getColumnArray(), "table_name", etc.) to be able to reuse them in the new code?
  3. Would it be possible to get the name of the function that contains the matched code block?

Upvotes: -1

Views: 125

Answers (1)

Scott McPeak

Reputation: 12749

I take it you are running the query in the question using clang-query. But you also state that your goal is to rewrite C++ code, and clang-query cannot do that; it merely finds the code that matches a given matcher expression and prints it to the console. To rewrite C++ code, you need to use one of the Clang APIs, such as LibTooling for C++ or the Python libclang bindings.

Nevertheless, I'll answer the questions for a pure clang-query context. See the AST Matcher Reference for more information (albeit terse) on the individual matchers.

Q1: Filtering on string literals

clang-query cannot filter on the contents of string literals, as asked and answered previously.

Q2a: Getting the constructor arguments

Having matched a varDecl, use hasInitializer(cxxConstructExpr(hasArgument(...))) to get the arguments passed to the constructor.

This assumes there are a fixed number of arguments.
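Since clang-query only reports the bound arguments, reusing them in the new code is a separate step performed by whatever rewriting tool you build. The following is a minimal sketch of that assembly step in plain C++, assuming the source text of each piece has already been extracted by such a tool. The struct QueryPieces and the function buildReplacement are hypothetical names made up for illustration, not part of Clang, and the QueryBuilder chain is copied from the question as-is.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the source text of the pieces found in the
// original declaration. None of these names come from Clang.
struct QueryPieces {
  std::string varName;                      // e.g. q
  std::string columnsExpr;                  // e.g. getColumnArray()
  std::string tableExpr;                    // e.g. "table_name" (with quotes)
  std::vector<std::string> constraintArgs;  // e.g. id and "abc"
};

// Assemble the QueryBuilder chain shown in the question from the pieces.
// Assumption: the original format string was "SELECT %LC FROM %T WHERE %W",
// so the pieces map one-to-one onto select(), from(), and where().
std::string buildReplacement(const QueryPieces &p) {
  assert(!p.varName.empty() && "variable name is required");

  // Join the constraint arguments with ", ".
  std::string where;
  for (std::size_t i = 0; i < p.constraintArgs.size(); ++i) {
    if (i != 0) {
      where += ", ";
    }
    where += p.constraintArgs[i];
  }

  return "auto " + p.varName + " = QueryBuilder.from(" + p.tableExpr +
         ").select(" + p.columnsExpr + ").where(" + where + ").build();";
}
```

Calling buildReplacement with the pieces from the question (q, getColumnArray(), "table_name", and the two arguments of buildConstraints) yields the target declaration; the resulting string would then be substituted for the source range of the matched varDecl.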

Q2b: Getting arguments for a variable-argument callee

In comments it was clarified that the Query class constructor takes a variable number of arguments, presumably something like:

class Query {
public:
  Query(char const *fmt, ...);
  ...
};

If all of the parameters had names and types, then we could use forEachArgumentWithParam, but they don't. So, instead, we have to use a sequence of optionally(hasArgument(N, ...)) for N up to some large but fixed limit.

Q3: Getting the containing function

Having matched a varDecl, use hasAncestor(functionDecl()) to get the definition of the containing function.

Example query for Q2 and Q3

Here is a shell script that runs clang-query and demonstrates the answers to Q2 and Q3:

#!/bin/sh

PATH=$HOME/opt/clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04/bin:$PATH

query='m

  varDecl(
    isExpansionInMainFile(),
    hasType(cxxRecordDecl(hasName("S"))),
    hasInitializer(
      cxxConstructExpr(
        optionally(
          hasArgument(0,
            expr().bind("arg0")
          )
        ),
        optionally(
          hasArgument(1,
            expr().bind("arg1")
          )
        ),
        optionally(
          hasArgument(2,
            expr().bind("arg2")
          )
        ),
        optionally(
          hasArgument(3,
            expr().bind("arg3")
          )
        ),
        optionally(
          hasArgument(4,
            expr().bind("arg4")
          )
        ),
        optionally(
          hasArgument(5,
            expr().bind("arg5")
          )
        ),
        optionally(
          hasArgument(6,
            expr().bind("arg6")
          )
        )
      )
    ),
    hasAncestor(
      functionDecl().bind("containingFunction")
    )
  ).bind("varDecl")

'

if [ "x$1" = "x" ]; then
  echo "usage: $0 filename.cc -- <compile options like -I, etc.>"
  exit 2
fi

# Run the query.  Setting 'bind-root' to false means clang-query will
# not also print a redundant "root" binding.
clang-query \
  -c="set bind-root false" \
  -c="$query" \
  "$@"

# EOF

When run on the input:

struct S {
  S(...);
};

void f()
{
  S s1(0,1);
  S s2(0,1,2,3,4,5,6,7,8,9);
}

it produces the output:

$ ./cmd.sh test.cc --

Match #1:

$PWD/test.cc:7:8: note: "arg0" binds here
  S s1(0,1);
       ^
$PWD/test.cc:7:10: note: "arg1" binds here
  S s1(0,1);
         ^
$PWD/test.cc:5:1: note: "containingFunction" binds here
void f()
^~~~~~~~
$PWD/test.cc:7:3: note: "varDecl" binds here
  S s1(0,1);
  ^~~~~~~~~

Match #2:

$PWD/test.cc:8:8: note: "arg0" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
       ^
$PWD/test.cc:8:10: note: "arg1" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
         ^
$PWD/test.cc:8:12: note: "arg2" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
           ^
$PWD/test.cc:8:14: note: "arg3" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
             ^
$PWD/test.cc:8:16: note: "arg4" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
               ^
$PWD/test.cc:8:18: note: "arg5" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
                 ^
$PWD/test.cc:8:20: note: "arg6" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
                   ^
$PWD/test.cc:5:1: note: "containingFunction" binds here
void f()
^~~~~~~~
$PWD/test.cc:8:3: note: "varDecl" binds here
  S s2(0,1,2,3,4,5,6,7,8,9);
  ^~~~~~~~~~~~~~~~~~~~~~~~~
2 matches.

In the example, it stops at seven arguments (arg0 through arg6) because that is as far as the query goes, but more optionally(hasArgument(N, ...)) clauses can be added as needed.
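Writing those clauses by hand gets tedious for a large limit, so one option is to generate them. Here is a small sketch in plain C++ that produces the repeated clause text for a given number of arguments; the function name makeArgumentClauses is hypothetical, and the output is only matcher text meant to be pasted inside cxxConstructExpr(...) in the query.

```cpp
#include <cassert>
#include <string>

// Build the repeated optionally(hasArgument(N, ...)) clauses for argument
// indices 0 .. count-1, each bound to "arg<N>", separated by ",\n".
std::string makeArgumentClauses(int count) {
  assert(count >= 0 && "count must not be negative");

  std::string out;
  for (int n = 0; n < count; ++n) {
    if (n != 0) {
      out += ",\n";
    }
    const std::string idx = std::to_string(n);
    out += "optionally(hasArgument(" + idx + ", expr().bind(\"arg" + idx +
           "\")))";
  }
  return out;
}
```

For example, makeArgumentClauses(7) reproduces the seven clauses used in the script above, just without the indentation and line breaks inside each clause.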

Upvotes: 1
