Kublakhan

Reputation: 11

How can I find words that occur frequently across several different texts?

So, I'm trying to find words that crop up in a collection of several texts. They don't necessarily have to be very frequent in any given text, or even across all the texts; all that matters is that their frequency is roughly the same in every text in the sample.

This seems fairly simple, but I haven't been able to find a clean and elegant way to do it. The only idea that comes to mind is computing the frequency of every word in each text (using something like this, say), turning those lists into dictionaries, and then keeping every key whose range of values across all the dictionaries is fairly low (say, the lowest value is within 25% of the highest). That seems like it'd work, but it feels like such a kludge, and this problem seems common and banal enough that I thought I'd ask if there's a better solution out there.
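For reference, the range-based idea described above can be sketched roughly like this; the function names, the lowercasing, and the 25% default are my own choices, not anything prescribed:

```python
from collections import Counter
import re


def relative_frequencies(text):
    """Map each word to its share of the text's total word count."""
    words = re.findall(r"\w+", text.lower())
    total = len(words)
    return {word: count / total for word, count in Counter(words).items()}


def stable_words(texts, tolerance=0.25):
    """Words appearing in every text whose frequency range is small.

    A word qualifies if its lowest per-text frequency is within
    `tolerance` of its highest, i.e. min >= (1 - tolerance) * max.
    Returns a dict mapping each qualifying word to its per-text
    frequency list.
    """
    freqs = [relative_frequencies(t) for t in texts]
    common = set.intersection(*(set(f) for f in freqs))
    result = {}
    for word in common:
        values = [f[word] for f in freqs]
        if min(values) >= (1 - tolerance) * max(values):
            result[word] = values
    return result
```

Note this only considers words present in all texts; a word missing from one text would otherwise have a frequency of 0 there and fail any range check anyway.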

Upvotes: 1

Views: 61

Answers (1)

d-k-bo

Reputation: 672

I think there is no better way than calculating the frequency of each word in each text

from collections import Counter
import re

word_frequencies = []  # one dict of relative frequencies per text
for text in list_of_text_strings:
    word_list = re.findall(r'\w+', text)
    total_words = len(word_list)
    word_frequencies.append({
        word: count / total_words
        for word, count in Counter(word_list).items()
    })

and then creating a set of all words.

all_words = {word for text in word_frequencies for word in text}

To compare the frequencies, it's probably best to calculate the standard deviation of each word's frequency across the texts and build a dict of word/std pairs, sorted in ascending order.

import math


def std(xs):
    # Population standard deviation; compute the mean once up front.
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


word_std_deviations = dict(sorted(
    (
        (word, std([text.get(word, 0) for text in word_frequencies]))
        for word in all_words
    ),
    key=lambda x: x[1]
))
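Since the dict is sorted by rising standard deviation, the words distributed most evenly across the texts come first. A small usage sketch (the example data and the 0.01 cutoff here are made up for illustration):

```python
from itertools import islice

# Made-up data in the same shape as word_std_deviations above:
# word -> standard deviation of its per-text frequencies, ascending.
word_std_deviations = {"the": 0.001, "a": 0.004, "cat": 0.02, "quark": 0.09}

# Most uniform words come first, so the N "most shared" words
# are simply the first N keys (dicts preserve insertion order).
top_words = list(islice(word_std_deviations, 2))

# Alternatively, keep everything under an arbitrary deviation cutoff.
stable = [w for w, s in word_std_deviations.items() if s < 0.01]
```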

Upvotes: 1
