How to convert a tokenized dataframe to string to generate a wordcloud

Question

I'm reading an Excel file into a dataframe and then normalizing it (lowercasing, removing stopwords, etc.).

The dataframe keeps only the columns I need from the Excel file. After tokenizing, a column looks something like this:

df['col1']

0    [this, is, fun, interesting]
1    [this, is, fun, too]
2    [even, more, fun]

I have more columns like this, such as df['col2'] and so on.

Now I want to generate a word cloud:

import matplotlib.pyplot as plt
from wordcloud import WordCloud

text = WordCloud().generate(df['col1'])
plt.imshow(text)
plt.axis("off")
plt.show()

This isn't working, apparently because WordCloud expects a string. How do I convert my entire dataframe to a string?

Ideally I'd convert the entire dataframe to one string and generate a single word cloud from it, but if that's not possible, then at least a word cloud per column would be nice.

b-fg · Accepted Answer

You just need to convert your column to a string. So far you only have a Series of token lists, which WordCloud cannot take. Simply,

text = WordCloud().generate(df['col1'].to_string())

And your output image is: [image of the resulting word cloud]
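Note that `to_string()` returns the printed form of the column, so the index labels and list brackets end up in the text as well. If you would rather pass only the tokens, one option is to join them yourself. This is a minimal sketch, assuming every cell holds a list of strings; the `col2` data and the helper name `column_to_text` are made up for illustration:

```python
import pandas as pd

# Toy data mirroring the question; the contents of 'col2' are invented.
df = pd.DataFrame({
    "col1": [["this", "is", "fun", "interesting"],
             ["this", "is", "fun", "too"],
             ["even", "more", "fun"]],
    "col2": [["another", "column"], ["of", "tokens"], ["here"]],
})


def column_to_text(series):
    """Join every token in a column of token lists into one string."""
    return " ".join(token for tokens in series for token in tokens)


# One string for a single column (for a word cloud per column).
col1_text = column_to_text(df["col1"])

# One string for the whole dataframe, assuming all columns hold token lists.
all_text = " ".join(column_to_text(df[col]) for col in df.columns)

print(col1_text)
print(all_text)
```

Either string can then be passed to `WordCloud().generate(...)` in the same way as above, which should cover both the per-column and the whole-dataframe case from the question.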
