Reputation: 166
I am trying to build a deep learning model with a transformer architecture. While cleaning the dataset, the following error occurred.
I am using PyTorch and Google Colab, and I am trying to clean a dataset of Java methods and comments.
Tested Code
import re
from typing import List
from fast_trees.core import FastParser

parser = FastParser('java')

def get_cmt_params(cmt: str) -> List[str]:
    '''
    Grabs the parameter identifier names from a JavaDoc comment
    :param cmt: the comment to extract the parameter identifier names from
    :returns: an array of the parameter identifier names found in the given comment
    '''
    # Match every "@param <name>" tag in the comment
    params = re.findall(r'@param\s+\w+', cmt)
    param_names = []
    for param in params:
        # Each match looks like "@param name"; keep only the name
        param_names.append(param.split()[1])
    return param_names
Error Occurred
Downloading repo https://github.com/tree-sitter/tree-sitter-java to /usr/local/lib/python3.7/dist-packages/fast_trees/tree-sitter-java.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-31-64f6fa6ed39b> in <module>()
3 from fast_trees.core import FastParser
4
----> 5 parser.set_language = FastParser('java')
6
7 def get_cmt_params(cmt: str) -> List[str]:
3 frames
/usr/local/lib/python3.7/dist-packages/fast_trees/core.py in FastParser(lang)
96 }
97
---> 98 return PARSERS[lang]()
/usr/local/lib/python3.7/dist-packages/fast_trees/core.py in __init__(self)
46
47 def __init__(self):
---> 48 super().__init__()
49
50 def get_method_parameters(self, mthd: str) -> List[str]:
/usr/local/lib/python3.7/dist-packages/fast_trees/core.py in __init__(self)
15 class BaseParser:
16 def __init__(self):
---> 17 self.build_parser()
18
19 def build_parser(self):
/usr/local/lib/python3.7/dist-packages/fast_trees/core.py in build_parser(self)
35 self.language = Language(build_dir, self.LANG)
36 self.parser = Parser()
---> 37 self.parser.set_language(self.language)
38
39 # Cell
ValueError: Incompatible Language version 13. Must not be between 9 and 12
Can anybody help me solve this issue?
Upvotes: 2
Views: 1170
Reputation: 226
The fast-trees library uses the tree-sitter library, and its authors recommend using version 0.2.0 of tree-sitter with fast-trees. However, downgrading tree-sitter to 0.2.0 will not resolve your problem. I tried that downgrade myself.
So, rather than investing time in tracking down the bug in tree-sitter, it is better to move to another stable library that meets your requirements. Since you need to extract features from Java code, you can use the javalang library.
javalang is a pure Python library for working with Java source code. It provides a lexer and parser targeting Java 8. The implementation is based on the Java language spec available at http://docs.oracle.com/javase/specs/jls/se8/html/.
You can find it at https://pypi.org/project/javalang/0.13.0/
Since javalang is a pure Python library, it should let you move forward with your research without running into this kind of incompatibility.
Upvotes: 2
Reputation: 477
fast_trees uses tree_sitter, and according to the tree_sitter repo this is an incompatibility issue. If you know the owner of fast_trees, ask them to upgrade their tree_sitter version.
Or you can fork it and upgrade it yourself, but keep in mind that the result may not be backwards compatible, and it may take more than simply installing a new version.
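Before going either route, it may help to confirm which versions are actually installed in your Colab runtime. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is made up for this example, and the distribution names `tree_sitter` and `fast_trees` are assumptions that may need adjusting to match how the packages were installed.

```python
from importlib import metadata
from typing import Optional

def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

# Distribution names are assumptions; adjust them if pip lists different names
for name in ("tree_sitter", "fast_trees"):
    print(name, installed_version(name) or "not installed")
```

Including these versions when you contact the fast_trees maintainer makes the incompatibility easier to reproduce.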
Upvotes: 2