Reputation: 85
I have a problem printing Arabic text in Python. I wrote code that converts English characters into Arabic ones (so-called chat language, or Franco Arabic) and then combines the different results to produce suggestions based on user input.
import itertools

def transliterate(francosentence, verbose=False):
    francowords = francosentence.split()
    arabicconvertedwords = []
    for i in francowords:
        rankeddata=[]
        rankeddata=transliterate_word(i)
        arabicconvertedwords.append(rankeddata)
        for index in range(len(rankeddata)):
            print rankeddata[index]
    ran=list(itertools.product(*arabicconvertedwords))
    for I in range(len(ran)):
        print ran[I]
The first print (print rankeddata[index]) shows Arabic words, but after the combination step the second print (print ran[I]) shows something like this: (u'\u0627\u0646\u0647', u'\u0631\u0627\u064a\u062d', u'\u0627\u0644\u062c\u0627\u0645\u0639\u0647')
How can I print Arabic words?
Upvotes: 1
Views: 1439
Reputation: 155584
Your second loop is operating over tuples of unicode (product yields a single product at a time as a tuple), not individual unicode values.
While print uses the str form of the object printed, tuple's str form uses the repr of the contained objects; it doesn't propagate "str-iness" (technically, tuple lacks __str__ entirely, so it's falling back to __repr__).
If you want to see the Arabic, you need to print the elements individually or concatenate them so you're printing strings, not tuples. For example, you could change:
print ran[I]
to something like:
print u', '.join(ran[I])
which will convert to a single comma-separated unicode value that print will format as expected (the str form), rather than using the repr form with escapes for non-ASCII values.
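To make the str/repr distinction concrete, here is a minimal sketch using two of the words from the question. The variable names are made up for illustration. It avoids the print statement so it runs unchanged on Python 2 and Python 3; note that on Python 3 the repr of a string no longer escapes printable non-ASCII characters, so the tuple would show quoted Arabic rather than \u escapes, but it is still the repr form.

```python
# Two of the transliterated words from the question, written with escapes
words = (u'\u0627\u0646\u0647', u'\u0631\u0627\u064a\u062d')

# str() of a tuple is the same text as repr() of the tuple...
tuple_text = str(words)
same_as_repr = (tuple_text == repr(words))

# ...and that text embeds the repr() of each element, quotes included
element_reprs_used = repr(words[0]) in tuple_text

# Joining produces one plain unicode string with no quotes or parentheses;
# this is the value to hand to print to see the Arabic
joined = u', '.join(words)
```

Passing joined to print displays the Arabic text, whereas passing words displays the tuple's repr form.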
Side-note: As a point of style (and memory use), use the iterator protocol directly; don't listify everything and then use C-style indexing loops. The following code has to store a ton of stuff in memory if the inputs are large (the total size of the output is the multiplicative product of the lengths of each input):
ran=list(itertools.product(*arabicconvertedwords))
for I in range(len(ran)):
    print u', '.join(ran[I])
where it could easily produce just one item at a time on demand, so the first results appear sooner and the full set of combinations never has to be held in memory:
# Don't listify...
ran = itertools.product(*arabicconvertedwords)
for r in ran:  # Iterate items directly, no need for list or indexing
    print u', '.join(r)
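One practical payoff of staying lazy is that you can stop early. The sketch below is only an illustration: the candidate lists are hypothetical stand-ins for what transliterate_word might return, and first_suggestions is a made-up helper name. It uses itertools.islice to take the first few combinations without generating the rest.

```python
import itertools

# Hypothetical candidate lists, one per input word (stand-ins for the
# ranked results that transliterate_word would return)
candidates = [
    [u'\u0627\u0646\u0647', u'\u0623\u0646\u0647'],
    [u'\u0631\u0627\u064a\u062d'],
    [u'\u0627\u0644\u062c\u0627\u0645\u0639\u0647', u'\u0627\u0644\u062c\u0627\u0645\u0639\u0629'],
]

def first_suggestions(candidate_lists, limit):
    """Return at most `limit` space-joined combinations, generated lazily."""
    combos = itertools.product(*candidate_lists)  # iterator, no output list built
    return [u' '.join(combo) for combo in itertools.islice(combos, limit)]

# Only 3 of the 4 possible combinations are ever produced
top = first_suggestions(candidates, 3)
```

Each entry in top is a single unicode string, so printing the entries one by one shows the Arabic text directly.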
Upvotes: 3