firmo23

Reputation: 8444

Word with a much higher frequency than the rest makes the other words almost invisible in wordcloud

In the wordcloud below, the word "oil" has a much higher frequency than the rest of the words, so it is displayed much bigger and, as a matter of fact, the rest of the words cannot be seen. How can I deal with this issue? Is there a zoom option or something similar? Or a way to reduce the size of the word "oil"?

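  library(wordcloud2)

  # default demo data: all words are readable
  wordcloud2(data = demoFreq)

  # inflate the frequency in row 1 ("oil") to reproduce the problem
  demoFreq[1, 2] <- 8000
  wordcloud2(demoFreq)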

Upvotes: 1

Views: 384

Answers (1)

Jon Spring

Reputation: 66775

A log transform does a decent job here, but arguably (this is subjective) it does too much "flattening out" between different orders of magnitude.

Alternatively, you could raise the frequency to a power between 0 and 1 and try a few values to see what works best for your data. To my eye, something around a cube root (like x^0.3) is a good balance between preserving the original scale and showing enough detail from the less frequent items.

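library(wordcloud2)

# assumes demoFreq as modified in the question (oil set to 8000);
# keep the original counts so each transform starts from the raw values
demoFreq$orig_freq = demoFreq$freq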

# too even, perhaps
demoFreq$freq = log(demoFreq$orig_freq)
wordcloud2(demoFreq)


# maybe closer to what you want: oil no longer overwhelms but is still the biggest
demoFreq$freq = (demoFreq$orig_freq)^0.3
wordcloud2(demoFreq)

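To get a feel for how strongly each transform compresses the scale before plotting, you can compare the ratio between the largest and smallest frequency. This is only a rough sketch in base R: the `freq` vector holds made-up counts (not the real demoFreq values), and `ratio` and `spread` are hypothetical names used for illustration.

```r
# Made-up frequencies for illustration: one dominant word plus ordinary ones
freq <- c(oil = 8000, said = 73, prices = 48, opec = 42, mln = 31)

# Ratio of the largest to the smallest value (hypothetical helper)
ratio <- function(x) max(x) / min(x)

# How much of the original spread survives each transform
spread <- c(
  raw       = ratio(freq),        # untransformed counts
  power_0.5 = ratio(freq^0.5),    # square root
  power_0.3 = ratio(freq^0.3),    # roughly a cube root
  log       = ratio(log(freq))    # natural log; needs all counts > 1
)

print(round(spread, 2))
```

The smaller the ratio, the flatter the cloud will look. Exponents closer to 1 keep more of the original contrast, exponents closer to 0 behave more like the log, so you can tune the power until the dominant word stops crowding out the rest.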

Upvotes: 1
