Me All

Reputation: 269

Creating a dictionary in Python and using it to translate a word

I have created a Spanish-English dictionary in Python and I have stored it using the variable translation. I want to use that variable in order to translate a text from Spanish into English. This is the code I have used so far:

from nltk.corpus import swadesh
import my_books

es2en = swadesh.entries(['es', 'en'])
translation = dict(es2en)

for sentence in my_books.sents("book_1"):
    for word in my_books.words("book_1"):
        if word in es2en:
            print(translation, end= " ")
        else:
            print("unknown_word", end= " ")
    print("")

My problem is that none of the words in book_1 is actually translated into English, so I get a text full of unknown_word. I think I'm probably using translation in the wrong way. How can I achieve the desired result?

Upvotes: 0

Views: 3461

Answers (3)

Wahyu Bram

Reputation: 425

I am currently building a translation tool (a language dictionary).

It translates Bahasa (Indonesian) to English and vice versa.

I am building it from scratch. First I collect all the words in Bahasa along with their meanings.

Then I compare them with the WordNet database (by crawling it).

Once the English meanings are grouped and paired with the Bahasa words, collect as much text data as possible and separate it into scientific content and everyday content.

Tokenize all the data into sentences, then calculate which words are most likely to appear together with which other words (in both Bahasa and English). This is needed because a word can have several meanings, and the calculation is used to choose which meaning to use.

For example, in Bahasa 'bisa' can mean poison, in which case it is highly likely to appear with snake or bite. 'bisa' can also mean being able to do something, in which case it is highly likely to appear with verbs or expressions of willingness to do something.

So if the tokenized text pairs 'bisa' with snake or bite, look for the matching meaning in English by checking snake and poison in the English data. There you will find that venom regularly appears with snake (and has a similar meaning to toxin / poison).

Another grouping can be done by word type (noun, verb, adjective, etc.).

bisa == poison (noun)

bisa == can (verb)

That's it. Once you have the calculation, you no longer need the database; you only need the word-matching data. You can do the calculation against online data (e.g. Wikipedia), downloaded data, a Bible or book file, or any other source that contains lots of sentences.
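As a rough illustration of the pairing idea, here is a minimal sketch that picks a meaning for 'bisa' from the words around it. The counts in sense_context and the helper choose_sense are made up for this example; in practice the counts would come from your own tokenized corpus.

```python
from collections import Counter

# Hypothetical co-occurrence counts for each meaning of 'bisa'.
# In a real system these would be computed from a large corpus.
sense_context = {
    "poison": Counter({"ular": 5, "gigitan": 3}),           # snake, bite
    "can": Counter({"saya": 4, "makan": 2, "pergi": 3}),    # I, eat, go
}

def choose_sense(context_words, sense_context):
    """Return the meaning whose known neighbours best match the context."""
    scores = {
        sense: sum(counts[word] for word in context_words)
        for sense, counts in sense_context.items()
    }
    return max(scores, key=scores.get)

# Words surrounding 'bisa' in two illustrative sentences.
print(choose_sense(["ular", "itu", "punya"], sense_context))
print(choose_sense(["saya", "pergi"], sense_context))
```

Words that never co-occurred with a meaning simply add zero to its score, because a Counter returns 0 for missing keys.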

Upvotes: 0

aghast

Reputation: 15300

The .entries() method, when given more than one language, returns not a dictionary but a list of tuples. See here for an example.

You need to convert your list of pairs (2-tuples) into a dictionary. You are doing that with your translation = statement.

However, you then ignore the translation variable, and check for if word in es2en:

You need to check if the word is in translation, and subsequently look up the correct translation, instead of printing the entire dictionary.
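A minimal sketch of that fix follows. The short es2en list stands in for the list of pairs that swadesh.entries(['es', 'en']) returns, and the sentences list stands in for my_books.sents("book_1"), which I am assuming yields lists of words. The helper translate_sentence is a name made up for this example.

```python
# Stand-in for swadesh.entries(['es', 'en']): a list of
# (spanish, english) tuples. These pairs are illustrative only.
es2en = [("yo", "I"), ("perro", "dog"), ("comer", "eat")]
translation = dict(es2en)

def translate_sentence(words, translation, unknown="unknown_word"):
    """Translate word by word, using a placeholder for missing words."""
    # Look each word up in the dictionary, not in the list of tuples.
    return " ".join(translation.get(word, unknown) for word in words)

# Hypothetical tokenized sentences standing in for my_books.sents("book_1").
sentences = [["yo", "comer"], ["perro", "grande"]]

for sentence in sentences:
    # Iterate over the words of this sentence, not over the whole book.
    print(translate_sentence(sentence, translation))
```

Using translation.get(word, unknown) does the membership check and the lookup in one step.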

Upvotes: 2

Victor Vasiliev

Reputation: 129

This could be a case-sensitivity issue. For example, if a dict contains the key 'Bomb' and you look up 'bomb', it won't be found. Lowercase all the keys built from es2en, then look up word.lower() instead of word.
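Something along these lines, as a sketch. The two pairs are invented for the example (in the question they would come from swadesh.entries), and lookup is a hypothetical helper name.

```python
# Illustrative pairs; note the capitalised key "Bomba".
es2en = [("Bomba", "bomb"), ("perro", "dog")]

# Build the dictionary with lowercased Spanish keys.
translation = {es.lower(): en for es, en in es2en}

def lookup(word, translation, unknown="unknown_word"):
    """Case-insensitive lookup: lowercase the word before searching."""
    return translation.get(word.lower(), unknown)

print(lookup("bomba", translation))
print(lookup("PERRO", translation))
print(lookup("gato", translation))
```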

Upvotes: 0
