Reputation: 337
I’m afraid this is another noob question.
What I want to do is to use a Map in order to count how often a word appears in a poe…m and then print the results to the console.
I came up with the following code, which I believe works (though it is probably not idiomatic):
val poe_m="""Once upon a midnight dreary, while I pondered weak and weary,
|Over many a quaint and curious volume of forgotten lore,
|While I nodded, nearly napping, suddenly there came a tapping,
|As of some one gently rapping, rapping at my chamber door.
|`'Tis some visitor,' I muttered, `tapping at my chamber door -
|Only this, and nothing more.'"""
val separators = Array(' ', ',', '.', '-', '\n', '\'', '`')
var words = new collection.immutable.HashMap[String, Int]
for (word <- poe_m.stripMargin.split(separators) if !word.isEmpty)
  words = words + (word.toLowerCase -> (words.getOrElse(word.toLowerCase, 0) + 1))
words.foreach(entry => println("Word : " + entry._1 + " count : " + entry._2))
As far as I understand, in Scala, immutable data structures are preferred to mutable ones and val is preferable to var, so I’m facing a dilemma: words should be a var (allowing a new instance of the map to be used for each iteration) if results are to be stored in an immutable Map, while turning words into a val implies using a mutable Map.
Could someone enlighten me about the proper way to deal with this existential problem?
Upvotes: 4
Views: 3929
Reputation: 7963
Credit lies elsewhere (Travis and Daniel in particular) for what follows, but there was a simpler one-liner needing to get out.
val words = poe_m split "\\W+" groupBy identity mapValues {_.size}
There's a simplification in that you won't need stripMargin because the regex, as suggested by Daniel, disposes of the margin characters as well.
If you want, you could retain the _.isEmpty filtering to protect against the edge case of the empty String, which yields ("" -> 1).
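As a rough sketch of the one-liner above with the empty-token filter kept in, something like the following should work. The object and method names are made up for illustration, lower-casing is borrowed from the question's code rather than from this answer, and a plain map is used in place of mapValues only because mapValues behaves differently across Scala versions.

```scala
object RegexWordCount {
  // Hypothetical helper, not part of the original answer.
  def countWords(text: String): Map[String, Int] =
    text
      .split("\\W+")          // split on runs of non-word characters (this also eats the '|' margins)
      .filter(_.nonEmpty)     // drop the empty token left by a leading separator or an empty input
      .map(_.toLowerCase)     // assumption: counts should be case-insensitive, as in the question
      .groupBy(identity)      // word -> all of its occurrences
      .map { case (word, occurrences) => word -> occurrences.length }
}
```

Called as RegexWordCount.countWords(poe_m), this would give an immutable Map without any var.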
Upvotes: 1
Reputation: 369
This is how it is done in the very good book "Programming in Scala: A Comprehensive Step-by-Step Guide, 2nd Edition" by Martin Odersky, Lex Spoon, and Bill Venners:
import scala.collection.mutable

def countWords(text: String) = {
  val counts = mutable.Map.empty[String, Int]
  // Split on runs of spaces and basic punctuation
  for (rawWord <- text.split("[ ,!.]+")) {
    val word = rawWord.toLowerCase
    val oldCount =
      if (counts.contains(word)) counts(word)
      else 0
    counts += (word -> (oldCount + 1))
  }
  counts
}
However, it also uses a mutable Map.
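If the mutable Map is the concern, one possible variation (my own sketch, not taken from the book) keeps the mutation local to the function and hands back an immutable copy. The object name is hypothetical, and withDefaultValue(0) stands in for the explicit contains check.

```scala
import scala.collection.mutable

object LocalMutableWordCount {
  // Hypothetical variant of countWords: mutation never escapes this method.
  def countWords(text: String): Map[String, Int] = {
    // Missing keys read as 0, so no contains check is needed
    val counts = mutable.Map.empty[String, Int].withDefaultValue(0)
    for (rawWord <- text.split("[ ,!.]+") if rawWord.nonEmpty)
      counts(rawWord.toLowerCase) += 1
    counts.toMap // immutable snapshot for the caller
  }
}
```

The caller only ever sees a val holding an immutable Map, which sidesteps the var versus mutable question at the call site.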
Upvotes: 1
Reputation: 8412
I am a noob with Scala too, so, there may be better ways to do it. I have come up with the following:
poe_m.stripMargin.split(separators)
  .filter(x => !x.isEmpty)
  .groupBy(x => x).foreach {
    case (w, ws) => println(w + " " + ws.size)
  }
By applying successive functions, you avoid the need for vars and mutable collections.
Upvotes: 2
Reputation: 15345
Well, in functional programming it is preferred to use immutable objects and to update them through functions (for example, a tail-recursive function returning the updated map). However, if you are not dealing with heavy loads, you should prefer the mutable map to the use of var, not because it is more powerful (even if I think it should be) but because it is easier to use.
Finally, Travis Brown's answer is a solution for your concrete problem; mine is more of a personal philosophy.
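To make the tail-recursive idea mentioned above concrete, here is a minimal sketch. The object and function names are illustrative only, and it assumes the text has already been split into a List of tokens.

```scala
import scala.annotation.tailrec

object TailRecWordCount {
  // Hypothetical helper: each recursive call passes along an updated immutable map.
  def countWords(tokens: List[String]): Map[String, Int] = {
    @tailrec
    def loop(remaining: List[String], counts: Map[String, Int]): Map[String, Int] =
      remaining match {
        case Nil => counts // nothing left: the accumulator is the result
        case word :: rest =>
          loop(rest, counts.updated(word, counts.getOrElse(word, 0) + 1))
      }
    loop(tokens, Map.empty)
  }
}
```

No var and no mutable collection appear here; the @tailrec annotation asks the compiler to verify that the recursion is compiled into a loop.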
Upvotes: 2
Reputation: 139058
In this case you can use groupBy and mapValues:
val tokens = poe_m.stripMargin.split(separators).filterNot(_.isEmpty)
val words = tokens.groupBy(w => w).mapValues(_.size)
More generally this is a job for a fold:
val words = tokens.foldLeft(Map.empty[String, Int]) {
case (m, t) => m.updated(t, m.getOrElse(t, 0) + 1)
}
The Wikipedia entry on folds gives some good clarifying examples.
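To see what the fold above does step by step, the small sketch below swaps foldLeft for scanLeft, which keeps every intermediate accumulator instead of only the last one. The three-token list and the object name are made up for illustration.

```scala
object FoldTrace {
  // Illustrative input, not the full poem
  val tokens = List("rapping", "tapping", "rapping")

  // Same update function as the fold, but every intermediate map is kept
  val steps: List[Map[String, Int]] =
    tokens.scanLeft(Map.empty[String, Int]) { (m, t) =>
      m.updated(t, m.getOrElse(t, 0) + 1)
    }

  // The last step is exactly what foldLeft would have returned
  val result: Map[String, Int] = steps.last
}
```

Here steps starts with the empty map and grows by one updated copy per token, so the accumulator plays the role that the var played in the question.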
Upvotes: 10