Reputation: 2179
Say I have a corpus of documents that I want to read one by one and store in a data structure. The structure will probably be a list of objects, each an instance of a class that represents a single document. Inside that class I need a data structure to hold the contents of each document. What should that be? Also, if I want to count word occurrences and retrieve the most frequent words in each document, do I need a data structure that lets me do this in less than the O(n) time it would take to scan all the contents sequentially?
Upvotes: 1
Views: 4441
Reputation: 64632
Use an associative array, also called a map or dictionary; different programming languages use different terms for the same data structure.
Each key is a word, and its value is the number of times that word occurs in the document. For example:
{
'on' -> 15,
'and' -> 43,
'I' -> 157,
'confluence' -> 1,
'dear' -> 2
}
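As a rough sketch of how this could look in Python (the question names no language, so that choice is an assumption, as are the `Document` class, its `most_frequent` method, and the simple regex tokenizer), each document can hold a `collections.Counter`, which is a dictionary specialised for counting. Building the counts still takes one pass over the words, but after that a single word's count is an average O(1) lookup, and the top k words come from the distinct words only, not from a rescan of the full text.

```python
from collections import Counter
import re


class Document:
    """Hypothetical container for one document in the corpus."""

    def __init__(self, text):
        self.text = text
        # Lowercase the text and treat runs of letters/apostrophes as words.
        # This tokenizer is a simplifying assumption, not a full word splitter.
        words = re.findall(r"[a-z']+", text.lower())
        # Counter maps each word to its number of occurrences.
        self.word_counts = Counter(words)

    def most_frequent(self, k):
        # Returns up to k (word, count) pairs, highest count first.
        return self.word_counts.most_common(k)


# The corpus is a plain list of Document objects.
corpus = [
    Document("the cat and the dog and the bird"),
    Document("dear dear reader"),
]

for doc in corpus:
    print(doc.most_frequent(2))
```

Looking up one word is then `corpus[0].word_counts["the"]`, and a word that never appears yields 0 rather than raising an error.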
Upvotes: 2