Scala: get HTML content from a web page

Question

I am trying to fetch the whole HTML of a web page in Scala and then parse it to extract certain information. The standard HTML library doesn't work for me: when I print the HTML content, it doesn't print the whole page. Is there a way to get the full HTML content of a web page?

Som Bhattacharyya · Accepted Answer

Well, you could use the excellent scala-scraper library. It's basically a wrapper for the JSoup Java library.
You could write code that reads like this (taken from GitHub):

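import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object NewsApp extends App {
  // Fetch and parse the page
  val browser = JsoupBrowser()
  val doc = browser.get("http://observador.pt")

  println()
  println("=== OBSERVADOR ===")

  // Print the src attribute of the logo image
  // (|> and extractorAt may need extra imports or setup, depending on the library version)
  doc >> extractor(".logo img", attr("src")) |> println
  doc >> extractorAt[String]("example-extractor") |> println

  println("==================")
  println()

  // Print each match for the headline links in the small news list
  doc >> ".small-news-list h4 > a" foreach println
}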
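
If you only need the raw page source as one string, the Scala standard library can do that without any extra dependency. This is a minimal sketch, assuming Scala 2.13 or later (for scala.util.Using) and a UTF-8 page; FetchHtml, fetchHtml and titleOf are names made up for this example, not part of scala-scraper. The regex is only a quick check that the whole page arrived; for real parsing a proper HTML parser such as the library above is the safer choice.

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical helper object, for illustration only.
object FetchHtml {

  // Reads the entire response body at `url` into a single String.
  // The Source is closed even if reading fails.
  def fetchHtml(url: String, charset: String = "UTF-8"): String =
    Using.resource(Source.fromURL(url, charset))(_.mkString)

  // Returns the text between <title> and </title>, if present.
  // (?is) makes the match case-insensitive and lets '.' span newlines.
  def titleOf(html: String): Option[String] = {
    val Title = "(?is)<title[^>]*>(.*?)</title>".r
    Title.findFirstMatchIn(html).map(_.group(1).trim)
  }
}
```

You would call it as FetchHtml.fetchHtml("http://observador.pt") and then print the result or its length to confirm nothing is cut off. Keep in mind that this returns only what the server sends; anything the page builds later with JavaScript will not be in the string.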
