Reputation: 1601
Really struggling with this one.
I've recently set up a bash shell script to extract, concatenate and deduplicate the strings to translate out of a whole website's view pages (an MVC framework is in use). It looks something like this:
for x in *.php; do xgettext --no-wrap --language=PHP -e --flag=_:1:pass-c-format -a "$x" -o "${x%.php}.pot"; done
msgcat -u -s --output-file=$WEBSITENAME-concat.pot *.pot
msguniq -u --output-file=$WEBSITENAME-unique.pot $WEBSITENAME-concat.pot
msgmerge -s -v -U $WEBSITENAME.po $WEBSITENAME-unique.pot
The above works absolutely fine apart from these 2 things, listed in order of difficulty to overcome:
1. Throughout the website source code I've been careful to ensure that all the strings which need translating are wrapped in the function _( 'string to translate here' ), but from what I can tell the xgettext command is extracting pretty much every string in the file, not just the ones wrapped in _(''). This means my resulting .pot file contains variable names, URLs, format strings, function parameters, configuration data and other inappropriate strings which should not be passed on to our translators. Due to the size of the website it isn't practical to remove these manually: we're looking at nearly 80,000 string entries, and this is just the first of a number of websites I'll need to process in the same way within the next 6 weeks! How can xgettext be configured to extract only the strings intended for translation?
2. A lot of the extracted strings have line breaks in them, which are inserted as \n within the strings. Is there some way to configure xgettext not to do this, or an easy way to remove these?
I've been reading through the documentation and searching the web for hours, even days, trying to find a solution, particularly to problem no. 1, and would really appreciate some help from the gettext gurus! Thanks in advance.
Upvotes: 3
Views: 1252
Reputation: 237
Possible answer to point 1.
I don't know about the version you are using, but with the Delphi one, you can add a file called ggexclude.cfg to exclude some components.
# exclude all occurrences of the specified class
# and property in all DFM files in or below the
# path where "ggexclude.cfg" is in
[exclude-form-class-property]
TField.FieldName
...
Upvotes: 0
Reputation: 2128
Just guessing here, but the first issue you are experiencing might be caused by the -a option. From the xgettext manual:
-a, --extract-all extract all strings
As a side note, your xgettext invocation seems quite complex. I, of course, don't know exactly what you want to do, but for me the following command is sufficient:
xgettext -L PHP --from-code=utf-8 *.php -o messages.pot
This will save all _()-enclosed strings into messages.pot.
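To make the difference concrete, here is a rough sketch of the extraction step without -a, wrapped in a small helper. The function name extract_marked is made up for illustration. The --no-wrap and --flag options are carried over from the question, and --keyword=_ only spells out what xgettext already treats as a default keyword for PHP. I haven't run this against the site in question, so treat it as a starting point rather than a drop-in replacement:

```shell
# extract_marked OUTPUT FILE...
# Hypothetical helper: writes only strings passed to _() into OUTPUT.
# Without -a/--extract-all, xgettext ignores strings that are not
# arguments of a keyword function.
extract_marked() {
    out=$1
    shift
    xgettext --language=PHP --from-code=UTF-8 --no-wrap \
        --keyword=_ --flag=_:1:pass-c-format \
        --output="$out" "$@"
}
```

It could be called as extract_marked "$WEBSITENAME.pot" *.php. Because all files go into a single template, duplicate msgids are merged by xgettext itself, so the separate msgcat and msguniq passes may no longer be needed before msgmerge.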
Upvotes: 4