user103987

Reputation: 65

Wordcloud for a non-English corpus

Dear friends, I am having trouble generating a proper wordcloud for non-English text. The cloud is generated, but the result is unsatisfactory: it shows isolated characters, while I need a wordcloud of whole words. I used the following code to generate it.

from os import path
import matplotlib.pyplot as plt
import random
import unicodedata
from wordcloud import WordCloud, STOPWORDS

# scorpus (the corpus text) and sw (the stopwords) are defined elsewhere (not shown)
text = scorpus
wordcloud = WordCloud(font_path='MBKhursheed.ttf',
                      relative_scaling = 1.0,
                      stopwords = sw
                      ).generate(text)
plt.imshow(wordcloud)
plt.axis("off")
plt.show()

Upvotes: 5

Views: 2201

Answers (1)

M. Chavoshi

Reputation: 1021

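First, you need to import these two packages (install them first if necessary; on PyPI they are named arabic-reshaper and python-bidi):

import arabic_reshaper
from bidi.algorithm import get_display

Then reshape the text and reorder it for display before passing it to WordCloud: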

text = get_display(arabic_reshaper.reshape(text))
wordcloud = WordCloud(font_path='MBKhursheed.ttf',
                      relative_scaling = 1.0,
                      stopwords = sw
                      ).generate(text)
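
One caveat with reshaping the whole text first: the reshaped words are stored as presentation forms in visual order, so stopwords in sw that are written in ordinary form may no longer match. A possible workaround, given here only as a sketch, is to count the words and drop the stopwords before reshaping, then reshape each remaining word and build the cloud from the frequencies. The helper names word_frequencies and make_cloud are hypothetical, and the plain whitespace split is a simplifying assumption rather than what WordCloud does internally.

```python
from collections import Counter


def word_frequencies(text, stopwords=(), shape=lambda w: w):
    """Count words in their original (logical) form, drop stopwords,
    then apply `shape` to each remaining word.

    Hypothetical helper: splits on whitespace only, so punctuation
    stays attached to the neighbouring word.
    """
    stop = set(stopwords)
    counts = Counter(w for w in text.split() if w not in stop)
    # Shape the keys only after filtering, so stopwords are compared
    # against the unshaped words.
    return {shape(w): n for w, n in counts.items()}


def make_cloud(text, stopwords, font_path):
    """Build a WordCloud from per-word frequencies (hypothetical helper)."""
    # Third-party imports are kept local so that word_frequencies
    # can be used without them.
    import arabic_reshaper
    from bidi.algorithm import get_display
    from wordcloud import WordCloud

    freqs = word_frequencies(
        text,
        stopwords,
        shape=lambda w: get_display(arabic_reshaper.reshape(w)),
    )
    cloud = WordCloud(font_path=font_path, relative_scaling=1.0)
    return cloud.generate_from_frequencies(freqs)
```

With the variables from the question, this would be called as make_cloud(scorpus, sw, 'MBKhursheed.ttf'), and the result can be shown with plt.imshow as before.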

Upvotes: 4
