sleepy whiskey

Reputation: 21

scala - error: ')' expected but '(' found

I'm new to Scala and cannot work out what is causing this error. I have searched similar topics, but none of the suggested fixes worked for me. I have some simple code that should find the line with the most words in a README.md file. The code I wrote is:

    val readme = sc.textFile("/PATH/TO/README.md")
    readme.map(lambda line :len(line.split())).reduce(lambda a, b: a if (a > b) else b)

and the error is:

    Name: Compile Error
    Message: <console>:1: error: ')' expected but '(' found.
    readme.map(lambda line :len(line.split()) ).reduce( lambda a, b: a                 
    if (a > b) else b )        ^

    <console>:1: error: ';' expected but ')' found.
    readme.map(lambda line :len(line.split()) ).reduce( lambda a, b: a 
    if (a > b) else b )                       ^

Upvotes: 0

Views: 7113

Answers (1)

Mike Allen

Reputation: 8279

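Your code isn't valid Scala. Both the lambda keyword and the "a if (a > b) else b" conditional are Python syntax, which is why the Scala compiler rejects the line.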

I think what you might be trying to do is to determine the largest number of words on a single line in a README file using Spark. Is that right? If so, then you likely want something like this:

    val readme = sc.textFile("/PATH/TO/README.md")
    readme.map(_.split(' ').length).reduce(Math.max)

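That last line uses some argument abbreviations: the underscore stands in for the function's single parameter, and Math.max is passed directly as the reducing function. This alternative version is equivalent, but a little more explicit: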

    readme.map(line => line.split(' ').length).reduce((a, b) => Math.max(a, b))
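To try the two spellings without Spark, here is a minimal sketch that binds each function to a plain value with an explicit type (the val names are made up for illustration). It can be pasted into a Scala REPL:

```scala
// Two ways to write the same word-counting function.
val countExplicit: String => Int = line => line.split(' ').length
val countShort: String => Int = _.split(' ').length

// Math.max is a method. With the expected function type written out,
// Scala converts it to a function value automatically.
val maxExplicit: (Int, Int) => Int = (a, b) => Math.max(a, b)
val maxShort: (Int, Int) => Int = Math.max

// countExplicit("one two three") == 3 and countShort("one two three") == 3
// maxExplicit(2, 7) == 7 and maxShort(2, 7) == 7
```

The short and explicit forms behave identically, so which one to use is purely a readability choice.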

The map function converts an RDD of Strings (each line in the file) into an RDD of Ints (the number of words on each line, delimited - in this particular case - by spaces). The function passed to reduce returns the larger of its two arguments, so reduce ultimately produces a single Int value: the largest number of words on any single line of the file.
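To inspect the intermediate values without a Spark cluster, here is a minimal sketch that uses a plain Scala Seq as a stand-in for the RDD. The sample lines are made up, and a local collection is only an approximation of how Spark distributes the work, but map and reduce accept the same function arguments:

```scala
// Hypothetical sample data standing in for the lines of README.md.
val sampleLines = Seq(
  "# My Project",
  "This project does something useful",
  "See the docs"
)

// Step 1: map each line to its word count (splitting on single spaces).
val wordCounts = sampleLines.map(line => line.split(' ').length)
// wordCounts == List(3, 5, 3)

// Step 2: reduce pairwise, keeping the larger count each time.
val maxWords = wordCounts.reduce((a, b) => Math.max(a, b))
// maxWords == 5
```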

After re-reading your question, it seems that you might want to know the line with the most words, rather than how many words are present. That's a little more involved, but this should do it:

    readme.map(line => (line.split(' ').length, line)).reduce((a, b) => if (a._1 > b._1) a else b)._2

Now map creates an RDD of a tuple of (Int, String), where the first value is the number of words on the line, and the second is the line itself. reduce then retains whichever of its two tuple arguments has the larger integer value (._1 refers to the first element of the tuple). Since the result is a tuple, we then use ._2 to retrieve the corresponding line (the second element of the tuple).
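The tuple approach can be checked the same way on a local collection. This is a sketch under the assumption that a plain Seq is an acceptable stand-in for the RDD; the sample lines are invented:

```scala
// Hypothetical sample data standing in for the lines of README.md.
val readmeLines = Seq(
  "Install",
  "Run the tests with sbt",
  "License: MIT"
)

// Pair each line with its word count: (count, line).
val counted = readmeLines.map(line => (line.split(' ').length, line))
// counted == List((1, "Install"), (5, "Run the tests with sbt"), (2, "License: MIT"))

// Keep whichever pair has the larger count, then take the line itself (._2).
val longestLine = counted.reduce((a, b) => if (a._1 > b._1) a else b)._2
// longestLine == "Run the tests with sbt"
```

Because the comparison is a strict >, two lines with the same count resolve to the second argument, which on a local Seq is the later line.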

I'd recommend you read a good book on Scala, such as Programming in Scala, 3rd Edition, by Odersky, Spoon & Venners. There are also some tutorials and an overview of the language on the main Scala language site. Coursera also has some free Scala training courses that you might want to sign up for.

Upvotes: 4
