Reputation: 5420

Get the most possible token types according to line and column number in ANTLR4

I would like to get a list of the most likely tokens for a given location in the text (line and column number), to determine what to offer for code auto-completion. Can this be easily achieved using the ANTLR 4 API?

I want the possible tokens for a given location because the user might be writing or editing somewhere in the middle of the text, where the set of possible tokens is still well defined.

Please give me some guidance, because I was unable to find an online resource on this topic.

Upvotes: 3

Answers (1)

mounds

Reputation: 1383

One way to get tokens by line number is to create a ParseTreeListener for your grammar, use it to walk a given ParseTree and index TerminalNodes by line number. I don't know C#, but here is how I've done it in Java. Logic should be similar.

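import java.util.ArrayList;
import java.util.List;

import org.antlr.v4.runtime.misc.MultiMap;
import org.antlr.v4.runtime.tree.TerminalNode;

public class MyLineIndexer extends MyGrammarParserBaseListener {

    // ANTLR's MultiMap<K, V> is a LinkedHashMap<K, List<V>>, so each line maps to a list of nodes.
    // Initialize it here so visitTerminal never hits a null index.
    protected MultiMap<Integer, TerminalNode> filelineTokenIndex = new MultiMap<>();

    @Override
    public void visitTerminal(TerminalNode node) {
        // map every token to its file line for searching later...

        if (node.getSymbol() != null) {
            List<TerminalNode> tokens;
            Integer line = node.getSymbol().getLine();
            if (!filelineTokenIndex.containsKey(line)) {
                tokens = new ArrayList<>();
                filelineTokenIndex.put(line, tokens);
            } else {
                tokens = filelineTokenIndex.get(line);
            }
            tokens.add(node);
        }
        super.visitTerminal(node);
    }
}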

Then walk the parse tree the usual way:

ParseTree parseTree = ... ; // parse it how you want to
MyLineIndexer indexer = new MyLineIndexer();
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk(indexer, parseTree);

Getting the token at a given line and column is now reasonably straightforward, and efficient as long as each line holds a reasonable number of tokens, because the lookup scans that line's tokens linearly. For example, you can add another method to the listener like this:

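// caretLine is 1-based and caretPos is a 0-based column, matching
// Token.getLine() and Token.getCharPositionInLine().
public TerminalNode findTerminalNodeAtCaret(int caretPos, int caretLine) {
    // a caret at column 0 (or a negative position) is treated as touching no token
    if (caretPos <= 0) return null;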

    if (this.filelineTokenIndex.containsKey(caretLine)) {
        List<TerminalNode> nodes = filelineTokenIndex.get(caretLine);

        if (nodes.size() == 0) return null;

        int tokenEndPos, tokenStartPos;

        for (TerminalNode n : nodes) {
            if (n.getSymbol() != null) {
                tokenEndPos = n.getSymbol().getCharPositionInLine() + n.getText().length();
                tokenStartPos = n.getSymbol().getCharPositionInLine();
                // If the caret is within this token, return this token
                if (caretPos >= tokenStartPos && caretPos <= tokenEndPos) {
                    return n;
                }
            }
        }
    }
    return null;
}
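
To try the lookup logic without generating a parser first, here is a minimal, self-contained sketch of the same idea. It is not ANTLR code: `Tok` is a hypothetical stand-in for an ANTLR token (1-based line, 0-based column), and the class and method names are made up for illustration. The matching rule is the same as in the method above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CaretLookupSketch {

    /** Hypothetical stand-in for an ANTLR token: 1-based line, 0-based column. */
    public static final class Tok {
        public final String text;
        public final int line;
        public final int column;

        public Tok(String text, int line, int column) {
            this.text = text;
            this.line = line;
            this.column = column;
        }
    }

    // line number -> tokens on that line, in the order they were added
    private final Map<Integer, List<Tok>> byLine = new HashMap<>();

    /** Index a token under its line, as visitTerminal does in the listener. */
    public void add(Tok tok) {
        byLine.computeIfAbsent(tok.line, k -> new ArrayList<>()).add(tok);
    }

    /** Return the first token on caretLine whose span [start, end] contains caretPos, or null. */
    public Tok findAtCaret(int caretPos, int caretLine) {
        if (caretPos <= 0) return null;
        List<Tok> tokens = byLine.getOrDefault(caretLine, Collections.emptyList());
        for (Tok t : tokens) {
            int start = t.column;
            int end = start + t.text.length(); // inclusive, so a caret just after the token still matches
            if (caretPos >= start && caretPos <= end) {
                return t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        CaretLookupSketch index = new CaretLookupSketch();
        // Tokens of the single line: int x = 42;
        index.add(new Tok("int", 1, 0));
        index.add(new Tok("x", 1, 4));
        index.add(new Tok("=", 1, 6));
        index.add(new Tok("42", 1, 8));
        index.add(new Tok(";", 1, 10));

        Tok hit = index.findAtCaret(5, 1); // caret directly after "x"
        System.out.println(hit == null ? "none" : hit.text); // prints "x"
    }
}
```

Because the end position is inclusive, a caret sitting between two adjacent tokens (for example at column 10, between `42` and `;`) resolves to the earlier token. That is usually what you want for completion, since the user has just finished typing it.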

You will also need to ensure your parser allows for 'loose' parsing. While a language construct is being typed, the input is likely to be invalid, so your parser rules should tolerate incomplete constructs.

Upvotes: 1
