Reputation: 990
I'd like Babel to parse a file and extract translation strings that start with:
(_
and end with:
)
So a translatable message in file.myext could look like:
(_ "message")
String literals always start and end with a double quote (").
There is some documentation on doing this at: http://babel.pocoo.org/en/latest/messages.html?highlight=parser
But it seems overwhelmingly complicated. Can someone provide a simple example of a custom message extractor for Babel under the above constraints?
I found the Jinja2 extractor at: https://github.com/pallets/jinja/blob/99498320871a290f5799d4f96a7774fc8a34381e/jinja2/ext.py
But huh?!
The Django project also has its own extractor: https://github.com/python-babel/django-babel/blob/master/django_babel/extract.py
Upvotes: 0
Views: 212
Reputation: 4782
These extractors appear complex because they use lexical analysis ("lexers") to split the input into tokens and find the strings. This may seem overly complicated, but it's a very mature area of computer science, and the right tool for the job. Most beginners start with regular expressions and custom code for this kind of task, and, if they persist and learn from what's available, end up with a lexer and parser.
For your format, you are looking for this sequence of tokens:
(
_
"
"
)
This is a great problem for the many lexing / parsing libraries in Python and will be a perfect way to introduce yourself to this technology.
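As a rough starting point, here is a minimal sketch of a hand-rolled lexer plus extractor for the token sequence above, using only the standard library. It assumes the callable interface described in the Babel docs linked in the question: a function that takes (fileobj, keywords, comment_tags, options) and yields (lineno, funcname, message, comments) tuples, with the file opened in binary mode. The names TOKEN_RE, tokenize and extract_myext are made up for illustration, and the sketch ignores keywords and comment_tags and only understands double-quoted strings.

```python
import io
import re

# One named group per token kind. Assumption: only double-quoted strings,
# which may contain backslash escapes such as \" and \\.
TOKEN_RE = re.compile(r'''
    (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<STRING>"(?:[^"\\]|\\.)*")
  | (?P<NAME>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<WS>\s+)
  | (?P<OTHER>.)
''', re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Yield (kind, value, lineno) for every non-whitespace token."""
    lineno = 1
    for match in TOKEN_RE.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind != "WS":
            yield kind, value, lineno
        # Count newlines in every token so multi-line strings stay in sync.
        lineno += value.count("\n")


def extract_myext(fileobj, keywords, comment_tags, options):
    """Hypothetical extractor: finds ( _ "..." ) token sequences."""
    text = fileobj.read().decode(options.get("encoding", "utf-8"))
    tokens = list(tokenize(text))
    for i in range(len(tokens) - 3):
        lparen, name, string, rparen = tokens[i:i + 4]
        if (lparen[0] == "LPAREN" and name[:2] == ("NAME", "_")
                and string[0] == "STRING" and rparen[0] == "RPAREN"):
            body = string[1][1:-1]                      # drop the quotes
            message = re.sub(r'\\(["\\])', r'\1', body)  # undo \" and \\
            yield lparen[2], "_", message, []


if __name__ == "__main__":
    sample = io.BytesIO(b'(_ "hello")\n(other "ignored")\n')
    print(list(extract_myext(sample, ["_"], [], {})))
```

To have Babel call a function like this for *.myext files, it still has to be registered as an extraction method; the Babel documentation linked in the question covers how to reference a custom one.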
You'll also want to consider some other cases:
(_ 'single quotes')
(_ '''multi
line
quotes''')
(_ "strings with \"escaped quotes\".")
(_ "strings with 'mixed quotes'")
(_ "strings that are just wrong')
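To give a feel for what handling these cases involves, here is a small sketch of a string-literal scanner. It assumes Python-like quoting rules (single, double and triple quotes, with a backslash escaping the next character), which may not match what your format actually allows. The function name read_string is hypothetical.

```python
def read_string(text, pos):
    """Scan one string literal starting at text[pos].

    Returns (value, end_pos), where end_pos is the index just past the
    closing quote. Raises ValueError for a missing or mismatched quote.
    """
    quote = text[pos]
    if quote not in ('"', "'"):
        raise ValueError(f"expected a quote at offset {pos}")
    # Prefer the triple-quoted form when the literal opens with three quotes.
    if text.startswith(quote * 3, pos):
        quote = quote * 3
    i = pos + len(quote)
    chars = []
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            # Simplification: an escape just yields the next character as-is.
            chars.append(text[i + 1])
            i += 2
        elif text.startswith(quote, i):
            return "".join(chars), i + len(quote)
        else:
            chars.append(text[i])
            i += 1
    raise ValueError(f"unterminated string starting at offset {pos}")
```

With this scanner the first four cases come back as plain strings, while the last one (opened with a double quote, closed with a single quote) raises ValueError, which an extractor could report or skip.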
Upvotes: 1