user360907

Determining the size of a webpage response using Scala

I have an assignment where I need to determine how much cache space will be required to store the contents of a webpage, and I have to do it all in Scala, which I'm in the process of learning. I know I can get the required information with an HTTP HEAD request, but from what I've read it seems I need an external library for that.

Is it possible to download the HTTP header without using an HTTP request and extract the required information using only Scala (no calls to Java code)?

Upvotes: 0

Views: 140

Answers (2)

user1706698

If you must not use third-party libraries, one option is to fetch the page with Source.fromURL and then compute its size.
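A minimal sketch of that idea, under a few assumptions: the names `PageSize` and `fetchSize` are made up for illustration, and the body is decoded as ISO-8859-1 so that every byte becomes exactly one character and the character count equals the byte count. Note that this downloads the whole body just to measure it.

```scala
import scala.io.{Codec, Source}

// Hypothetical helper, not part of any library.
object PageSize {
  /** Downloads the resource at `url` and returns its size in bytes.
    * ISO-8859-1 maps each byte to one Char, so counting the decoded
    * characters gives the byte count of the body as received.
    */
  def fetchSize(url: String): Int = {
    val source = Source.fromURL(url)(Codec.ISO8859)
    try source.length   // consumes the iterator and counts the characters
    finally source.close()
  }
}
```

Usage would look like `PageSize.fetchSize("http://www.example.com/")`. The result is the size of that single response only; images, scripts and stylesheets referenced by the page are not included.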

Hope this helps ;)

Upvotes: 1

Hbf

Reputation: 3114

Without your restriction that only Scala may be used, I would have said: use Async-Http-Client's AsyncHandler and stop as soon as onHeadersReceived has been called.

Without external libraries, you could try to mimic what an HTTP client does. Here's a sample telnet session:

$ telnet www.google.com 80
Trying 173.194.40.20...
Connected to www.google.com.
Escape character is '^]'.
HEAD / HTTP/1.1
Host: www.google.com

HTTP/1.1 302 Found
Location: http://www.google.ch/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=c2b92507b9088226:FF=0:TM=1361870408:LM=1361870408:S=mbY_Qws86Z75gPAk; expires=Thu, 26-Feb-2015 09:20:08 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=dAFEWKT5vk9HWP1sTF6Oo49jv0sRV7_49ewSgD3fYRiTjHqlUasKl7Jz86SnJhtS-o9zU9raxwCLhdfvEwdwl9imRwONMBTDBKDXtJhFufLCnAoOKgDQetv0A5FTN3Da; expires=Wed, 28-Aug-2013 09:20:08 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Tue, 26 Feb 2013 09:20:08 GMT
Server: gws
Content-Length: 218
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN

(What I typed was HEAD / HTTP/1.1, Host: www.google.com, and an additional return.)

You could use the JVM's Socket class (java.net.Socket) to open a TCP connection to your server and send the HEAD request yourself, as in the example above, then read the Content-Length header from the response.
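Here is a rough sketch of how that could look, assuming plain HTTP on port 80 (no TLS) and a server that actually sends Content-Length. The object and method names (`HeadRequest`, `head`, `contentLength`) are made up for this example, and it does use java.net.Socket from the JVM's standard library.

```scala
import java.io.{BufferedReader, InputStreamReader}
import java.net.Socket
import scala.util.Try

// Hypothetical helper, not part of any library.
object HeadRequest {
  /** Returns the value of the Content-Length header, if one is present and numeric. */
  def contentLength(headerLines: Seq[String]): Option[Long] =
    headerLines
      .map(_.split(":", 2))
      .collect {
        case Array(name, value) if name.trim.equalsIgnoreCase("Content-Length") => value.trim
      }
      .flatMap(v => Try(v.toLong).toOption)
      .headOption

  /** Sends a HEAD request over a raw TCP socket and returns the status line
    * and header lines of the response (everything up to the first empty line).
    */
  def head(host: String, path: String = "/", port: Int = 80): List[String] = {
    val socket = new Socket(host, port)
    try {
      // Same request as in the telnet session, plus Connection: close
      // so the server does not keep the connection open afterwards.
      val request = s"HEAD $path HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n"
      val out = socket.getOutputStream
      out.write(request.getBytes("US-ASCII"))
      out.flush()

      val in = new BufferedReader(new InputStreamReader(socket.getInputStream, "ISO-8859-1"))
      Iterator.continually(in.readLine())
        .takeWhile(line => line != null && line.nonEmpty)
        .toList
    } finally socket.close()
  }
}
```

Usage would be something like `HeadRequest.contentLength(HeadRequest.head("www.google.com"))`. Keep in mind that for a redirect such as the 302 above, the Content-Length describes the redirect response, not the page it points to, and that servers using chunked transfer encoding may not send the header at all, in which case you get `None`.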

Upvotes: 0
