Reputation: 63
How exactly is glossary support in the Google translate_v3beta1 API supposed to work? I've been searching for this information, but haven't found it. Should the terms in the glossary simply override all other potential translations, or do they only give priority to the glossary translations, while the engine can still use other translations if they "fit better" according to the training data?
I've created a glossary using the Python libraries (I've tried two different ways: from a TSV file using the "language_pair" property, and from a CSV file using the "language_codes_set" property). Then I tried to use the glossary to override the translation of one term in a text string produced by a custom MT model (i.e. without a glossary, the engine translated the term one way, and I tried to use the glossary to force a different translation for that term), but without success.
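For reference, the two glossary definitions mentioned above differ in a single field. The following is a minimal sketch using plain dictionaries that mirror the Glossary resource fields ("name", "language_pair", "language_codes_set", "input_config"). The helper name, project ID, location, glossary ID, and bucket URIs are hypothetical placeholders, and the resulting dictionary would still have to be passed to the client's create_glossary call.

```python
def glossary_resource(project, location, glossary_id, input_uri,
                      source=None, target=None, language_codes=None):
    """Build a dict shaped like a Glossary resource (illustrative helper, not a library function)."""
    resource = {
        "name": f"projects/{project}/locations/{location}/glossaries/{glossary_id}",
        # The glossary file itself lives in a Cloud Storage bucket.
        "input_config": {"gcs_source": {"input_uri": input_uri}},
    }
    if language_codes is not None:
        # Equivalent-term-set glossary (CSV): one column per language.
        resource["language_codes_set"] = {"language_codes": list(language_codes)}
    else:
        # Unidirectional glossary (TSV): source term, tab, target term.
        resource["language_pair"] = {
            "source_language_code": source,
            "target_language_code": target,
        }
    return resource


# Placeholder values, assumed for illustration only.
tsv_glossary = glossary_resource(
    "my-project", "us-central1", "my-tsv-glossary",
    "gs://my-bucket/glossary.tsv", source="en", target="cs",
)
csv_glossary = glossary_resource(
    "my-project", "us-central1", "my-csv-glossary",
    "gs://my-bucket/glossary.csv", language_codes=["en", "cs"],
)
```

Only one of "language_pair" and "language_codes_set" is set on a given glossary, which is why the sketch treats them as alternatives.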
Now I'm not sure whether I made a mistake when creating or using the glossary (I'm not currently aware of any issue in my code), or whether there is no mistake on my side and the engine simply used another translation based on the training data. From experience with other custom machine translation platforms, I know that some of them use glossaries to override translations, while others use them only to prioritize the glossary terms without fully overriding all other potential translations. I therefore want to clarify this simple question first, before searching for other possible reasons why my code doesn't work as expected.
Thank you in advance.
Upvotes: 2
Views: 2631
Reputation: 63
OK, so I got the answers elsewhere. The translations from glossaries should override the terms used by the model. There was another reason why it didn't work for me: I was reading the wrong field of the response ("translations" instead of "glossaryTranslations").
EDIT as per Suresh's request in comments:
Details about the Google API response related to the above issue are here; when using the Python client library, see this.
Code sample snippet with the use of Python client library:
response = client.translate_text(
    contents=[source_text],
    parent=parent,
    mime_type='text/html',
    source_language_code=source_language,
    target_language_code=target_language,
    model=model_id,
    glossary_config=glossary_config,
    timeout=90,
)
# If you use a glossary, read "glossary_translations" here instead of "translations".
google_translation = response.glossary_translations[0].translated_text
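To avoid reading the wrong field by accident, the choice can be wrapped in a small helper. This is only a sketch: the function name pick_translation is made up, and it assumes a response object with the "translations" and "glossary_translations" lists used above, each item carrying "translated_text". The stand-in response below is built with SimpleNamespace so the example runs without calling the API.

```python
from types import SimpleNamespace


def pick_translation(response, index=0):
    """Return the glossary-applied text when present, else the plain model output."""
    glossary_items = getattr(response, "glossary_translations", None)
    if glossary_items:
        # Populated only when the request carried a glossary config.
        return glossary_items[index].translated_text
    return response.translations[index].translated_text


# Stand-in for an API response (assumed shape, for illustration only).
fake_response = SimpleNamespace(
    translations=[SimpleNamespace(translated_text="model output")],
    glossary_translations=[SimpleNamespace(translated_text="glossary output")],
)
no_glossary_response = SimpleNamespace(
    translations=[SimpleNamespace(translated_text="model output")],
    glossary_translations=[],
)

with_glossary = pick_translation(fake_response)
without_glossary = pick_translation(no_glossary_response)
```

Falling back to "translations" keeps the same code usable for requests sent without a glossary.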
Upvotes: 3