Reputation: 1247
I'm writing a .ctags file for a custom language. Like most languages, it allows multiple variable declarations on one line, e.g.:
int a, b, c;
I have a basic regex which recognizes 'a':
--regex-mylang=/^[ \t]*int[ \t]*([a-zA-Z_0-9]+)/\1/v,variable/
How do I modify this so that it matches 'b' and 'c' as well? I can't find anything in the ctags documentation that deals with multiple matches on a single line.
Upvotes: 9
Views: 3437
Reputation: 1300
Recent versions of Universal Ctags can capture them by using multiple regex tables (the --_mtable-regex-<LANG> option):
[jet@localhost]/tmp% cat input.x
int a, b, c;
[jet@localhost]/tmp% cat x.ctags
--langdef=X
--map-X=.x
--kinddef-X=v,var,variables
--_tabledef-X=main
--_tabledef-X=vardef
--_mtable-regex-X=main/int[ \t]+//{tenter=vardef}
--_mtable-regex-X=main/.//
--_mtable-regex-X=vardef/([a-zA-Z0-9]+)/\1/v/
--_mtable-regex-X=vardef/;//{tleave}
--_mtable-regex-X=vardef/.//
[jet@localhost]/tmp% u-ctags --options=x.ctags -o - ./input.x
a ./input.x /^int a, b, c;$/;" v
b ./input.x /^int a, b, c;$/;" v
c ./input.x /^int a, b, c;$/;" v
See https://docs.ctags.io/en/latest/optlib.html#advanced-pattern-matching-with-multiple-regex-tables for more details.
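To make the control flow of the two tables easier to follow, here is a rough Python sketch of the same idea. It is only an illustration of the logic, not how ctags is implemented: it assumes (my reading of the linked documentation) that each pattern of the current table is tried in order at the current input position, that {tenter} pushes a table and {tleave} pops back to the previous one. The names build_tables and scan are made up for this example, and Python's re stands in for the regex engine ctags uses.

```python
import re

ENTER, LEAVE = "enter", "leave"


def build_tables():
    # Mirrors the "main" and "vardef" tables from x.ctags above.
    # Each rule is (pattern, kind or None, table action or None).
    return {
        "main": [
            (re.compile(r"int[ \t]+"), None, (ENTER, "vardef")),
            (re.compile(r".", re.DOTALL), None, None),  # skip one character
        ],
        "vardef": [
            (re.compile(r"([a-zA-Z0-9]+)"), "v", None),  # emit a tag
            (re.compile(r";"), None, (LEAVE, None)),     # back to "main"
            (re.compile(r".", re.DOTALL), None, None),   # skip one character
        ],
    }


def scan(text):
    """Return (name, kind) pairs found by walking the input with the tables."""
    tables = build_tables()
    stack = ["main"]  # assumed table stack; the top entry is the active table
    pos = 0
    tags = []
    while pos < len(text):
        for pattern, kind, action in tables[stack[-1]]:
            m = pattern.match(text, pos)  # anchored at the current position
            if m is None:
                continue
            if kind is not None:
                tags.append((m.group(1), kind))
            if action is not None:
                if action[0] == ENTER:
                    stack.append(action[1])
                elif len(stack) > 1:
                    stack.pop()
            pos = m.end()
            break
        else:
            pos += 1  # no rule matched; move on so the loop always ends
    return tags


print(scan("int a, b, c;\n"))
```

Because the scan continues from where the previous match ended instead of restarting on the next line, every identifier between "int" and ";" gets its own tag.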
Upvotes: 7
Reputation: 341
It can be partially done with Universal Ctags, with the help of the {_multiline=N} and {scope} flags. N is the number of the capture group whose position is saved in the generated tags file.
For more information, see docs/optlib.rst.
Configuration: mylang.ctags
--langmap=mylang:.txt
--regex-mylang=/^[[:blank:]]*(int)[[:blank:]]/\1/{placeholder}{scope=set}{_multiline=1}
--regex-mylang=/(;)/\1/{placeholder}{scope=clear}
--regex-mylang=/[[:blank:]]*([[:alnum:]]+)[[:blank:]]*,?/\1/v,variable/{_multiline=1}{scope=ref}
Test file: test.txt
void main() {
int a, b, c, d;
}
Generate tags with: ctags --options=mylang.ctags test.txt
Generated tags file:
!_TAG_FILE_FORMAT 2 /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED 1 /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_OUTPUT_MODE u-ctags /u-ctags or e-ctags/
!_TAG_PROGRAM_AUTHOR Universal Ctags Team //
!_TAG_PROGRAM_NAME Universal Ctags /Derived from Exuberant Ctags/
!_TAG_PROGRAM_URL https://ctags.io/ /official site/
!_TAG_PROGRAM_VERSION 0.0.0 /cb4476eb/
a test.txt /^ int a, b, c, d;$/;" v
b test.txt /^ int a, b, c, d;$/;" v
c test.txt /^ int a, b, c, d;$/;" v
d test.txt /^ int a, b, c, d;$/;" v
int test.txt /^ int a, b, c, d;$/;" v
main test.txt /^void main() {$/;" v
void test.txt /^void main() {$/;" v
Upvotes: 0
Reputation: 36549
After going through this for a few hours, I'm convinced it can't be done. No matter what, the regular expression will only expand to one tag per line. Even if you put \1 \2 \3 ... as the expansion, that would just cause a tag consisting of multiple matches, instead of one tag per match.
It parses the C example correctly because inside the ctags source code it uses an actual code parser, not a regexp.
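The difference between "one tag per match" and "one tag made of several matches" can be shown with a small sketch. It uses Python's re module purely as a stand-in (ctags has its own regex engine and its own \N expansion), and the three-group pattern is a hypothetical variant I wrote for the illustration, not something from the ctags documentation.

```python
import re

line = "int a, b, c;"

# The pattern from the question: a single capture group, so a single name.
single = re.match(r"^[ \t]*int[ \t]*([a-zA-Z_0-9]+)", line)
first_only = single.expand(r"\1")

# A hypothetical pattern with three groups: the expansion is still ONE string,
# which would become one tag name rather than three separate tags.
triple = re.match(r"^[ \t]*int[ \t]*(\w+),[ \t]*(\w+),[ \t]*(\w+)", line)
combined = triple.expand(r"\1 \2 \3")

print(repr(first_only))
print(repr(combined))
```

The second pattern also hard-codes the number of variables, so even if a combined tag were acceptable it would not generalize to declarations of other lengths.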
Upvotes: 2
Reputation: 1920
--regex-perl=/^\s*?use\s+(\w+[\w\:]*?\w*?)/\1/u,use,uses/
--regex-perl=/^\s*?require\s+(\w+[\w\:]*?\w*?)/\1/r,require,requires/
--regex-perl=/^\s*?has\s+['"]?(\w+)['"]?/\1/a,attribute,attributes/
--regex-perl=/^\s*?\*(\w+)\s*?=/\1/a,aliase,aliases/
--regex-perl=/->helper\(\s?['"]?(\w+)['"]?/\1/h,helper,helpers/
--regex-perl=/^\s*?our\s*?[\$@%](\w+)/\1/o,our,ours/
--regex-perl=/^=head1\s+(.+)/\1/p,pod,Plain Old Documentation/
--regex-perl=/^=head2\s+(.+)/-- \1/p,pod,Plain Old Documentation/
--regex-perl=/^=head[3-5]\s+(.+)/---- \1/p,pod,Plain Old Documentation/
Upvotes: -4
Reputation: 5325
You're trying to do parsing with a regex, which is not generally possible. Parsing requires the equivalent of storing information on a stack, but a regular expression can embody only a finite number of different states.
Upvotes: 0