Alex V

Reputation: 3644

How do I read a web page in Racket?

All of the information I can find online is about writing web servers, but there seems to be very little about functions useful for web clients. Ideally, I would like the function to look something like this:

(website "http://www.google.com")

And return a string containing the entire web page, but I would be happy with anything that works.

Upvotes: 8

Views: 1743

Answers (1)

John Clements

Reputation: 17203

Here's a simple program that looks like it does what you want:

#lang racket

(require net/url)

(port->bytes
 (get-pure-port (string->url "http://www.google.com")))
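Since the question asks for a string and a function shaped like (website "http://www.google.com"), here is a minimal sketch of one way to wrap the snippet above. The name website comes from the question; the choice of port->string (which decodes the body as UTF-8) and the redirect limit of 5 are my assumptions, not anything the original code requires.

```racket
#lang racket

(require net/url)

;; website : string -> string
;; Hypothetical helper, named after the call shape in the question.
;; Fetches the body of the response (headers are not included) and
;; returns it as a string. port->string decodes the bytes as UTF-8,
;; which is an assumption about the page's encoding.
(define (website url-string)
  (define in
    ;; #:redirections 5 is an arbitrary limit chosen for this sketch;
    ;; the default of 0 does not follow redirects at all.
    (get-pure-port (string->url url-string) #:redirections 5))
  (dynamic-wind
    void
    (lambda () (port->string in))
    ;; Close the port even if reading raises an exception.
    (lambda () (close-input-port in))))

;; Only runs when the file is executed directly, not when required.
(module+ main
  (display (website "http://www.google.com")))
```

If you need the raw bytes instead (for example, for a page that is not UTF-8), swapping port->string for port->bytes gives the same behavior as the original snippet.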

If you're like me, you probably also want to parse it into an s-expression. Neil Van Dyke's neil/html-parsing does this:

#lang racket

(require (planet neil/html-parsing:2:0)
         net/url)

(html->xexp
 (get-pure-port (string->url "http://www.google.com")))

Note that because this program refers to a PLaneT package, running it for the first time will download and install the html-parsing package. Building the documentation could take quite a while. That's a one-time cost, though, and running the program again shouldn't take more than a few seconds.

EDIT: In 2023, this code still works fine, but PLaneT is no longer widely used. It would probably be more idiomatic now to install the html-parsing package, either with raco pkg install html-parsing or through DrRacket's File>>Package Manager... menu, and then run

#lang racket

(require html-parsing
         net/url)

(html->xexp
 (get-pure-port (string->url "http://www.google.com")))

Upvotes: 10
