user3260312

Reputation: 241

Translator database schema

I want to build a translator (NOT a dictionary) from English to any other language. The design and functionality of the site are ready, and the last thing left is the database schema. My question is: what should the database tables look like?

Note: It will be something like translate.google.com, but with only two languages to choose from. I use ASP.NET with C#. If you have any links or suggestions, please post them in the comments or in an answer.

How it will work:

  1. The user enters the text
  2. The user clicks the translate button
  3. An XMLHttpRequest is sent to an IHttpHandler
  4. A Translate class translates the text sent by the XMLHttpRequest
  5. The IHttpHandler responds with the translated text
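
For illustration only, here is a minimal sketch of steps 3 to 5 in Python rather than the ASP.NET/C# stack named above. The function name `handle_translate_request`, the JSON field names `text`, `target` and `translation`, and the `translate` callable are all assumptions made up for this sketch; the point is only that the handler stays thin and delegates to a separate translation component.

```python
import json
from typing import Callable


def handle_translate_request(body: str, translate: Callable[[str, str], str]) -> str:
    """Parse the request body, delegate to the translator, build the response.

    `body` is assumed to be JSON like {"text": "...", "target": "..."}.
    `translate` stands in for the Translate class from step 4.
    """
    request = json.loads(body)
    translated = translate(request["text"], request["target"])
    return json.dumps({"translation": translated})


def fake_translate(text: str, target: str) -> str:
    """Placeholder translator: only tags the text with the target language."""
    return f"[{target}] {text}"


if __name__ == "__main__":
    print(handle_translate_request('{"text": "hello", "target": "de"}', fake_translate))
```

Whatever the real translator does, keeping it behind one function like this means the database schema can change without touching the request handling.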

Upvotes: 0

Views: 916

Answers (2)

user2316116

Reputation: 6814

Machine translation (MT), a branch of natural language processing (NLP), is much more complex than just a database structure. If that's the only "thing that's left", then you are on the wrong path. MT requires knowledge of grammar, semantics, facts about the real world, etc. What the other answer describes is a dictionary, which helps but is not sufficient for translation in context.

Example: direct word translation is often ambiguous

"Open" in English is used for a variety of concepts: job openings, "Now Open" in front of a store, open questions, etc. See http://wordnetweb.princeton.edu/perl/webwn?s=open (11 verbs + 21 adjectives + 4 nouns). A translation into another language, e.g. German, results in different words for these senses. See the dictionary entries for that word at http://dict.leo.org/ende/index_en.html#/search=open (158 verbs + 85 adjectives + 1,344 nouns).
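
To make the ambiguity concrete, here is a small Python sketch. The three German entries and the sense labels are hand-picked illustrations, not data taken from WordNet or LEO, and the names `OPEN_DE`, `candidates` and `translate` are made up for this example. It shows that a lookup keyed on the word alone returns several candidates, and only a (word, sense) key gives one answer.

```python
# Hypothetical sense-tagged entries for the English word "open".
# Keys are (word, sense label); the labels are illustrative assumptions.
OPEN_DE = {
    ("open", "verb: to open something"): "öffnen",
    ("open", "adjective: the shop is open"): "geöffnet",
    ("open", "adjective: unresolved question"): "offen",
}


def candidates(word, table):
    """All translations stored for a word, ignoring the sense."""
    return sorted({t for (w, _sense), t in table.items() if w == word})


def translate(word, sense, table):
    """One translation, but only if the caller already knows the sense."""
    return table.get((word, sense))


if __name__ == "__main__":
    print(candidates("open", OPEN_DE))
    print(translate("open", "adjective: the shop is open", OPEN_DE))
```

Deciding which sense applies is exactly the part a table cannot do by itself.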

Translation essentially requires a good understanding of the source text (semantics) and of how the same (or a similar) situation is described in the target language.

Before designing a database, you should settle on a translation approach. For example, it could follow this flow:

  1. Segmenting documents into sentences and words
  2. Reduction of word forms to their canonical form and dictionary lookup
  3. Recognizing sentential structures (getting grammar)
  4. Assigning translations to single words
  5. Generating the structure of target sentences
  6. Generating word forms
  7. Adding layout information

[Source]
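
As a rough sketch of how the first steps of such a flow fit together, the Python below implements only steps 1, 2 and 4 with toy in-memory tables. Every entry in `LEMMAS` and `DICTIONARY_DE` is a made-up illustration, and steps 3, 5, 6 and 7 are deliberately left out.

```python
import re

# Toy data, invented for this sketch.
LEMMAS = {"goes": "go"}                                        # step 2
DICTIONARY_DE = {"the": "der", "dog": "Hund", "go": "gehen"}   # step 4


def segment(text):
    """Step 1: split text into sentences, then each sentence into words."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    return [re.findall(r"[A-Za-z']+", s) for s in sentences]


def lemmatize(words):
    """Step 2: reduce each word form to its canonical form (lowercased)."""
    return [LEMMAS.get(w.lower(), w.lower()) for w in words]


def translate_words(lemmas):
    """Step 4: assign a translation to each single word, if one is known."""
    return [DICTIONARY_DE.get(lemma, lemma) for lemma in lemmas]


def naive_translate(text):
    """Steps 1, 2 and 4 only; no grammar, no generation of word forms."""
    return [" ".join(translate_words(lemmatize(ws))) for ws in segment(text)]


if __name__ == "__main__":
    print(naive_translate("The dog goes."))
```

For "The dog goes." this yields "der Hund gehen", whereas correct German is "Der Hund geht." The missing verb inflection and capitalization are what steps 3, 5 and 6 have to supply.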

This helps you see that step 2 needs a table of all canonical forms (e.g. goes=go) or a table describing the rules for deriving these forms. Step 3 requires a table for grammar rules. Step 4 requires a table to associate words into word groups. Step 5 requires... etc.
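
A table for canonical forms could look roughly like the sketch below, written in Python with SQLite purely for illustration. The table names `CanonicalForms` and `WordForms`, their columns, and the sample rows are assumptions, not part of any existing schema. The composite key allows one surface form to map to more than one canonical form.

```python
import sqlite3


def build_forms_db():
    """Create an in-memory database with hypothetical tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CanonicalForms (
            CanonicalID   INTEGER PRIMARY KEY,
            CanonicalForm TEXT NOT NULL UNIQUE
        );
        CREATE TABLE WordForms (
            WordForm    TEXT NOT NULL,
            CanonicalID INTEGER NOT NULL REFERENCES CanonicalForms(CanonicalID),
            PRIMARY KEY (WordForm, CanonicalID)
        );
    """)
    conn.executemany("INSERT INTO CanonicalForms VALUES (?, ?)",
                     [(1, "go"), (2, "leaf"), (3, "leave")])
    conn.executemany("INSERT INTO WordForms VALUES (?, ?)",
                     [("goes", 1), ("went", 1), ("leaves", 2), ("leaves", 3)])
    return conn


def canonical_forms(conn, word_form):
    """Return every canonical form recorded for a surface form."""
    rows = conn.execute("""
        SELECT c.CanonicalForm
        FROM WordForms f
        JOIN CanonicalForms c ON c.CanonicalID = f.CanonicalID
        WHERE f.WordForm = ?
        ORDER BY c.CanonicalForm
    """, (word_form,)).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = build_forms_db()
    print(canonical_forms(conn, "goes"))
    print(canonical_forms(conn, "leaves"))
```

Note that "leaves" comes back with two candidates (the noun "leaf" and the verb "leave"), so even this table hands the disambiguation problem on to the grammar step.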

You can take the structure of the WordNet database as an example, but it will not help you much as long as the translation method you are going to use is unknown.

Upvotes: 3

umlcat

Reputation: 4143

This is from a business app that worked in real life.

// catalog table for languages

Languages = ([PK] LanguageID, LanguageTitle, LanguageDescr);

Example:

(2, 'French', 'France National Language (French)')
(3, 'Spanish', 'Spain National Language (Spanish Catalan)')

// catalog table for words in English; remember that a word's plural isn't always similar to its singular, like "foot" and "feet"

Words = ([PK] WordID, WordSingular, WordPlural, WordSingularFemale, WordPluralFemale);

Example:

(1, 'Foot', 'Feet', 'Foot', 'Feet')
(2, 'Man', 'Men', 'Woman', 'Women')
(3, 'Leaf', 'Leaves', 'Leaf', 'Leaves')
(4, 'Car', 'Cars', 'Car', 'Cars')
(5, 'Lion', 'Lions', 'Lioness', 'Lionesses')

// Work Table for words for each language

WordPerLang = ([PK] WPLID, [FK] LanguageID, [FK] WordID, WPLWordSingular,WPLWordPlural,
WPLWordSingularFemale, WPLPluralFemale);

Example:

(1, 3 /* Spanish */, 1 /* Foot */, 'Pie', 'Pies', 'Pie', 'Pies')
(2, 3 /* Spanish */, 5 /* Lion */, 'Leon', 'Leones', 'Leona', 'Leonas')
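
As a hedged sketch of how these three tables might be created and queried, here is a version in Python with SQLite (not the database or language of the original app, which is not stated). Column types, the `UNIQUE` constraint and the helper name `lookup_singular` are assumptions. The sample rows follow the examples above, with singular before plural to match the column order and one row per language/word pair.

```python
import sqlite3


def build_db():
    """Create the three tables in memory and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Languages (
            LanguageID    INTEGER PRIMARY KEY,
            LanguageTitle TEXT NOT NULL,
            LanguageDescr TEXT
        );
        CREATE TABLE Words (
            WordID             INTEGER PRIMARY KEY,
            WordSingular       TEXT NOT NULL,
            WordPlural         TEXT,
            WordSingularFemale TEXT,
            WordPluralFemale   TEXT
        );
        CREATE TABLE WordPerLang (
            WPLID                 INTEGER PRIMARY KEY,
            LanguageID            INTEGER NOT NULL REFERENCES Languages(LanguageID),
            WordID                INTEGER NOT NULL REFERENCES Words(WordID),
            WPLWordSingular       TEXT,
            WPLWordPlural         TEXT,
            WPLWordSingularFemale TEXT,
            WPLPluralFemale       TEXT,
            UNIQUE (LanguageID, WordID)  -- assumption: one row per word and language
        );
    """)
    conn.execute("INSERT INTO Languages VALUES (3, 'Spanish', 'Spain National Language')")
    conn.executemany("INSERT INTO Words VALUES (?, ?, ?, ?, ?)", [
        (1, "Foot", "Feet", "Foot", "Feet"),
        (5, "Lion", "Lions", "Lioness", "Lionesses"),
    ])
    conn.executemany("INSERT INTO WordPerLang VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (1, 3, 1, "Pie", "Pies", "Pie", "Pies"),
        (2, 3, 5, "Leon", "Leones", "Leona", "Leonas"),
    ])
    return conn


def lookup_singular(conn, english_singular, language_title):
    """Return the singular form in the target language, or None if missing."""
    row = conn.execute("""
        SELECT wpl.WPLWordSingular
        FROM Words w
        JOIN WordPerLang wpl ON wpl.WordID = w.WordID
        JOIN Languages l ON l.LanguageID = wpl.LanguageID
        WHERE w.WordSingular = ? AND l.LanguageTitle = ?
    """, (english_singular, language_title)).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_db()
    print(lookup_singular(conn, "Lion", "Spanish"))
```

This only covers word-for-word lookup of known forms; it says nothing about sentence structure.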

Cheers.

Upvotes: 0
