My WordCloud is missing the letter 's' at the end of words

Question

At first I thought the problem was in my data and that I had made a mistake while cleaning it. However, I checked, and that is not the case.

I am using this code:

import matplotlib.pyplot as plt
from wordcloud import WordCloud

plt.style.use('fivethirtyeight')

# Join all tweets into one string; df is the DataFrame shown below
allWords = ' '.join(df['full_text'])
wordCloud = WordCloud(collocations=True, width=1000, height=600,
                      random_state=21, max_font_size=120).generate(allWords)

# Render the word cloud without axes
plt.imshow(wordCloud, interpolation="bilinear")
plt.axis('off')
plt.show()

Now my word cloud shows truncated words like "coronaviru", "viru", and "crisi". With collocations=True, the full words do appear, but only in combination with other words, like "coronavirus case" or "coronavirus pandemic". Does anyone know how to fix this? Like I said, I checked the data, and the full, correct word is always there, so I guess the truncation happens inside WordCloud.

My data looks like this:

    created_at                        id                full_text
0   Sat Aug 01 00:25:53 +0000 2020    28934685093219    life is hard with coronavirus
1   Sat Aug 01 00:25:53 +0000 2020    28934685093219    coronavirus sucks

JustLinh · Accepted Answer

You need to pass normalize_plurals=False to the WordCloud constructor. Reference: https://amueller.github.io/word_cloud/generated/wordcloud.WordCloud.html

normalize_plurals: bool, default=True. Whether to remove trailing ‘s’ from words. If True and a word appears with and without a trailing ‘s’, the one with trailing ‘s’ is removed and its counts are added to the version without trailing ‘s’ – unless the word ends with ‘ss’. Ignored if using generate_from_frequencies.
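To make the rule quoted above concrete, here is a minimal pure-Python sketch of that kind of merge. It is only an illustration of the documented behaviour, not the library's actual code: the helper name merge_plurals and the sample tokens (including the s-less forms "viru" and "clas") are made up for the example.

```python
from collections import Counter


def merge_plurals(counts):
    """Fold 'word' + 's' into 'word' when both occur, skipping words ending in 'ss'.

    Hypothetical helper that mirrors the documented rule; not wordcloud's own code.
    """
    merged = Counter(counts)
    for word in list(counts):
        if word.endswith("s") and not word.endswith("ss"):
            singular = word[:-1]
            # Merge only if the form without the trailing 's' is also present
            if singular in counts:
                merged[singular] += merged.pop(word)
    return merged


# Made-up sample tokens, not taken from the question's data
tokens = ["case", "cases", "cases", "virus", "viru", "crisis", "class", "clas"]
raw = Counter(tokens)
merged = merge_plurals(raw)

for word, count in sorted(merged.items()):
    print(word, count)
```

In this sketch "cases" is folded into "case" and "virus" into "viru", while "crisis" is left alone because no "crisi" token exists, and "class" is left alone because it ends in "ss". With normalize_plurals=False no such merging should take place, so every word keeps its own count.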
