Dang Khoa

Reputation: 5823

Get the size of a file before downloading it with wget?

Is there a way to check the size of a file ahead of time, before I download it with wget? I know that the --spider option tells me whether a file exists, but I'd also like to find out the size of that file.

Upvotes: 73

Views: 57906

Answers (4)

Eugene Brevdo

Reputation: 899

This should work:

size_bytes=$(wget -S "${url}" --start-pos=500G 2>&1 | grep Content-Length | cut -d: -f2)

Upvotes: 0

haridsv

Reputation: 9683

I was actually looking for the size of a directory, and Google got me here. While there is no direct answer to that here, the accepted answer helped me build the following command on top of it:

wget --spider -m -np URL-to-dir 2>&1 | sed -n -e /unspecified/d -e '/^Length: /{s///;s/ .*//;p}' | paste -s -d+ | bc

The above runs wget in spider mode over the entire directory, which logs the length of each file in that directory. The output is then piped to sed, which drops entries whose length is unspecified and extracts a sequence of numbers (byte sizes). The last two components of the pipe join those numbers with + and sum them to get the total in bytes.
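As a variation on the same idea, the summing step can be sketched with a single awk call instead of sed, paste and bc. The function name sum_lengths is made up for this illustration, and it assumes wget keeps logging sizes as lines starting with "Length: " followed by the byte count, as in the output shown elsewhere in this thread.

```shell
# sum_lengths (hypothetical helper): read wget's log on stdin and print the
# total, in bytes, of every "Length: N ..." line. Lines such as
# "Length: unspecified" do not match the numeric pattern and are skipped.
sum_lengths() {
    awk '/^Length: [0-9]+/ { total += $2 } END { printf "%.0f\n", total }'
}

# Possible usage (URL-to-dir is a placeholder, as in the command above):
#   wget --spider -m -np URL-to-dir 2>&1 | sum_lengths
```

With no matching lines at all, this prints 0 rather than an empty result.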

Upvotes: 0

hmakholm left over Monica

Reputation: 23342

Hmm.. for me --spider does display the size:

$ wget --spider http://henning.makholm.net/
Spider mode enabled. Check if remote file exists.
--2011-08-08 19:39:48--  http://henning.makholm.net/
Resolving henning.makholm.net (henning.makholm.net)... 85.81.19.235
Connecting to henning.makholm.net (henning.makholm.net)|85.81.19.235|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9535 (9.3K) [text/html]     <-------------------------
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

$ 

(But beware that not all web servers will inform clients of the length of the data except by closing the connection when it's all been sent.)

If you're concerned about wget changing the format it reports the length in, you might use wget --spider --server-response and look for a Content-Length header in the output.
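A minimal sketch of that second approach, for use in a script. The function name content_length and the $url variable are made up for this example; it assumes wget prints each response header on its own line when --server-response is given, and it keeps the last Content-Length seen, since a redirect chain can produce more than one response.

```shell
# content_length (hypothetical helper): read the output of
# "wget --spider --server-response" on stdin and print the value of the
# last Content-Length header. Prints nothing if the server sent none.
content_length() {
    tr -d '\r' |
        awk 'tolower($1) == "content-length:" { len = $2 }
             END { if (len != "") print len }'
}

# Possible usage ($url is a placeholder; wget writes its log to stderr,
# hence the 2>&1):
#   size_bytes=$(wget --spider --server-response "$url" 2>&1 | content_length)
```

An empty result corresponds to the caveat above: the server did not announce a length.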

Upvotes: 94

Keith Thompson

Reputation: 263497

curl --head URL

Look for "Content-Length:" in the output.

And thanks to Henning Makholm's comment:

wget --spider URL

and look for "Length:" in the output.
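For scripting, the curl variant could be wrapped roughly as follows. The function name remote_size is hypothetical. The sketch adds -s (silent) and -L (follow redirects) to --head (-I), strips the carriage returns that end HTTP header lines, and matches the header name case-insensitively, because HTTP/2 responses show it in lower case.

```shell
# remote_size (hypothetical helper): print the Content-Length that the
# server reports for the URL given as $1, using a HEAD request.
# Returns a non-zero status if no Content-Length header was found.
remote_size() {
    curl -sIL "$1" |
        tr -d '\r' |
        awk -F': *' 'tolower($1) == "content-length" { len = $2 }
                     END { if (len != "") print len; else exit 1 }'
}

# Possible usage (URL is a placeholder):
#   remote_size URL || echo "server did not report a size" >&2
```

Only the last Content-Length is printed, so with redirects the value belongs to the final response.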

Upvotes: 42
