Reputation: 115
The following code does not compile unless I uncomment the commented line. The compiler says:
type mismatch; found : Unit required: Option[String]
import java.io._
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.text.PDFTextStripper
object PdfToText {
  def main(args: Array[String]) {
    val filename = "D:\\Scala\\DATA\\HelloWorld.pdf"
    getTextFromPdf(filename)
  }

  def getTextFromPdf(filename: String): Option[String] = {
    val pdf = PDDocument.load(new File(filename))
    println(new PDFTextStripper().getText(pdf))
    // Some(new PDFTextStripper().getText(pdf))
  }
}
The code runs fine if I uncomment the line:
Some(new PDFTextStripper().getText(pdf))
Output:
Welcome
to
The World of Scala
Could anyone please explain the behaviour of the line
Some(new PDFTextStripper().getText(pdf))
Upvotes: 0
Views: 230
Reputation: 30310
Option is a Scala type that represents the presence or absence of a value. Instead of runtime constructs like null and exceptions, which have significant downsides, Option (and equivalent constructs in other languages) allows for compile-time checking that you are handling both possibilities.
A common use of Option is a database lookup by id. It is quite possible there is no entity with that id. The return type of the function in Scala would be Option[Employee]. If you find one, it would return Some(employee); otherwise simply None. Both are subtypes of Option. Note that you can think of Option like a container of zero or one element.
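The lookup described above can be sketched roughly as follows. Employee, the in-memory employees map, findById and describe are all hypothetical names made up for this illustration; a real lookup would query a database instead of a Map.

```scala
// Hypothetical domain type, for illustration only.
final case class Employee(id: Int, name: String)

object EmployeeLookup {
  // Hypothetical in-memory stand-in for a database table.
  private val employees: Map[Int, Employee] = Map(
    1 -> Employee(1, "Ada"),
    2 -> Employee(2, "Grace")
  )

  // Map#get returns Some(employee) if the id is present, None otherwise.
  def findById(id: Int): Option[Employee] = employees.get(id)

  // The caller has to handle both cases to get a plain String out.
  def describe(id: Int): String = findById(id) match {
    case Some(e) => s"Found ${e.name}"
    case None    => s"No employee with id $id"
  }

  def main(args: Array[String]): Unit = {
    println(describe(1))
    println(describe(42))
  }
}
```

Because the return type is Option[Employee], a caller cannot use the result as an Employee without first deciding what to do when nothing was found.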
In your case, you have defined your function to return Option[String], which is why returning Some(String) containing the text of the file makes the compiler happy. I hope that answers your question.
Please note though that your function isn't really designed in a way that uses Option effectively, since there is no question of presence or absence: the value is always present. In other words, returning an Option isn't useful here. However, it would be perfectly appropriate to use Option to represent the cases where the file is or isn't available on the file system to be read.
For example:
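import java.io.File
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.text.PDFTextStripper

object PdfToText {
  def main(args: Array[String]): Unit = {
    val filename = "D:\\Scala\\DATA\\HelloWorld.pdf"
    val text = getTextFromPdf(filename)
    // map runs only on Some; getOrElse supplies the default for None
    val result = text.map(t => s"File contents: $t").getOrElse("No file found")
    println(result)
  }

  def getTextFromPdf(filename: String): Option[String] = {
    val file = new File(filename)
    if (file.exists() && !file.isDirectory) {
      val pdf = PDDocument.load(file)
      // close the document once the text has been extracted
      try Some(new PDFTextStripper().getText(pdf))
      finally pdf.close()
    } else {
      None
    }
  }
}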
Here presence is defined by the existence of a file that is not a directory, which allows me to return its contents in a Some, and absence is defined by the nonexistence of the file or the file being a directory, which manifests as a None.
I then account for both possibilities in main. If there's text to be read because the function gave me back a Some, I call map (which only fires on a Some) to transform the text into a different string. If I get None, we skip over to the getOrElse and produce a default string.
Either way, we print out whatever we got, which is guaranteed to be a String no matter what happened with the original File.
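To see the two paths in isolation, here is a small sketch without the PDF dependency. The object name OptionHandling, the render functions and the default string "<no text>" are placeholders chosen for this example only.

```scala
object OptionHandling {
  // map transforms the value inside a Some and leaves None untouched;
  // getOrElse then unwraps the Some or supplies the default for None.
  def render(text: Option[String]): String =
    text.map(t => s"File contents: $t").getOrElse("<no text>")

  // The same logic written as an explicit pattern match.
  def renderWithMatch(text: Option[String]): String = text match {
    case Some(t) => s"File contents: $t"
    case None    => "<no text>"
  }

  def main(args: Array[String]): Unit = {
    println(render(Some("Welcome")))
    println(render(None))
  }
}
```

Both versions always produce a String; which one to use is mostly a matter of taste, although map with getOrElse tends to be shorter when each branch is a single expression.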
As a shameless plug, you can learn more about Option in our tutorial Nine Reasons to Try Scala. Just fast forward to 8:36.
Upvotes: 1
Reputation: 51271
The result of a method is the value of its final expression. println returns Unit. If that's the last line, then that's what the method returns, which conflicts with the stated Option[String] return type.
The code new PDFTextStripper().getText(pdf) returns a String, and wrapping it in Some() makes it an Option[String], which does match the stated method return type.
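If you want to both print the text and return it, one possible shape is to bind the text to a val, print it, and end the method with the Some. In this sketch extractText is a hypothetical stand-in for new PDFTextStripper().getText(pdf), so that the example runs without PDFBox.

```scala
object PrintAndReturn {
  // Hypothetical stand-in for new PDFTextStripper().getText(pdf).
  def extractText(filename: String): String = s"text of $filename"

  def getText(filename: String): Option[String] = {
    val text = extractText(filename) // extract once and keep the result
    println(text)                    // side effect only; its Unit result is discarded
    Some(text)                       // final expression, so this is the method's result
  }

  def main(args: Array[String]): Unit = {
    val result = getText("HelloWorld.pdf")
    println(result.isDefined)
  }
}
```

The println is no longer the last expression, so the method's value is the Option[String] on the final line.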
explanation
Here is a method that compiles.
def six(): Int = { // takes no arguments and returns an Int
  println("top")   // string sent to stdout
  6                // this is the Int it returns
}
Here is a method that does not compile.
def six(): Int = { // takes no arguments and returns an Int
  6                // this is the Int it should return
  println("end")   // Error: type mismatch
}
This method is supposed to return an Int (that's what the : Int means), but the last line is a println() statement and println returns Unit, not an Int, so that causes the error. This method is trying to return Unit when it is supposed to return Int.
This is a very basic concept in the Scala language.
Upvotes: 2