7stud

Reputation: 48599

Why does curl repeat headers in the output?

Options I used:

-I, --head
      (HTTP/FTP/FILE) Fetch the HTTP-header only! HTTP-servers feature
      the  command  HEAD which this uses to get nothing but the header
      of a document. When used on an FTP or FILE file,  curl  displays
      the file size and last modification time only.

-L, --location
      (HTTP/HTTPS)  If the server reports that the requested page has moved to a different location
      (indicated with a Location: header and a 3XX response code), this option will make curl redo the  request
      on  the  new  place.  If  used together with -i, --include or -I, --head, headers from all requested
      pages will be shown. When authentication is used, curl only sends its  credentials  to  the  initial
      host. If a redirect takes curl to a different host, it won't be able to intercept the user+password.
      See also --location-trusted on how to change this. You can limit the amount of redirects  to  follow
      by using the --max-redirs option.

      When  curl  follows a redirect and the request is not a plain GET (for example POST or PUT), it will
      do the following request with a GET if the HTTP response was 301, 302, or 303. If the response  code
      was any other 3xx code, curl will re-send the following request using the same unmodified method.

      You  can tell curl to not change the non-GET request method to GET after a 30x response by using the
      dedicated options for that: --post301, --post302 and --post303.

-v, --verbose
      Be more verbose/talkative during the operation. Useful for debugging  and  seeing  what's  going  on
      "under the hood". A line starting with '>' means "header data" sent by curl, '<' means "header data"
      received by curl that is hidden in normal cases, and a line starting with '*' means additional  info
      provided by curl.

      Note  that  if  you  only  want HTTP headers in the output, -i, --include might be the option you're
      looking for.

      If you think this option still doesn't give you enough details, consider using --trace  or
      --trace-ascii instead.

      This option overrides previous uses of --trace-ascii or --trace.

      Use -s, --silent to make curl quiet.

Below is the output that I'm wondering about. In the response containing the redirect (301), all the headers are displayed twice, but only one copy of each has the < in front of it. How am I supposed to interpret that?

$ curl -ILv http://www.mail.com

* Rebuilt URL to: http://www.mail.com/
*   Trying 74.208.122.4...
* Connected to www.mail.com (74.208.122.4) port 80 (#0)
> HEAD / HTTP/1.1
> Host: www.mail.com
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Date: Sun, 28 May 2017 22:02:16 GMT
Date: Sun, 28 May 2017 22:02:16 GMT
< Server: Apache
Server: Apache
< Location: https://www.mail.com/
Location: https://www.mail.com/
< Vary: Accept-Encoding
Vary: Accept-Encoding
< Connection: close
Connection: close
< Content-Type: text/html; charset=iso-8859-1
Content-Type: text/html; charset=iso-8859-1

< 
* Closing connection 0
* Issue another request to this URL: 'https://www.mail.com/'
*   Trying 74.208.122.4...
* Connected to www.mail.com (74.208.122.4) port 443 (#1)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384
* Server certificate: *.mail.com
* Server certificate: thawte SSL CA - G2
* Server certificate: thawte Primary Root CA
> HEAD / HTTP/1.1
> Host: www.mail.com
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Date: Sun, 28 May 2017 22:02:16 GMT
Date: Sun, 28 May 2017 22:02:16 GMT
< Server: Apache
Server: Apache
< Vary: X-Forwarded-Proto,Host,Accept-Encoding
Vary: X-Forwarded-Proto,Host,Accept-Encoding
< Set-Cookie: cookieKID=kid%40autoref%40mail.com; Domain=.mail.com; Expires=Tue, 27-Jun-2017 22:02:16 GMT; Path=/
Set-Cookie: cookieKID=kid%40autoref%40mail.com; Domain=.mail.com; Expires=Tue, 27-Jun-2017 22:02:16 GMT; Path=/
< Set-Cookie: cookiePartner=kid%40autoref%40mail.com; Domain=.mail.com; Expires=Tue, 27-Jun-2017 22:02:16 GMT; Path=/
Set-Cookie: cookiePartner=kid%40autoref%40mail.com; Domain=.mail.com; Expires=Tue, 27-Jun-2017 22:02:16 GMT; Path=/
< Cache-Control: no-cache, no-store, must-revalidate
Cache-Control: no-cache, no-store, must-revalidate
< Pragma: no-cache
Pragma: no-cache
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Set-Cookie: JSESSIONID=F0BEF03C92839D69057FFB57C7FAA789; Path=/mailcom-webapp/; HttpOnly
Set-Cookie: JSESSIONID=F0BEF03C92839D69057FFB57C7FAA789; Path=/mailcom-webapp/; HttpOnly
< Content-Language: en-US
Content-Language: en-US
< Content-Length: 85237
Content-Length: 85237
< Connection: close
Connection: close
< Content-Type: text/html;charset=UTF-8
Content-Type: text/html;charset=UTF-8

< 
* Closing connection 1

Upvotes: 1

Views: 683

Answers (2)

user7365852


Use:

curl -ILv http://www.mail.com 2>&1 | grep '^[<>\*].*$'

When cURL is called with the verbose command line flag, it sends the verbose output to stderr instead of stdout. The above command redirects stderr to stdout (2>&1), then we pipe the combined output to grep and use the above regex to only return the lines that begin with *, <, or >. All of the other lines in the output (including the dupes you were first concerned with) are removed from the output.
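
To see what that filter does without hitting the network, here is a small sketch that runs the same kind of grep over a few lines copied from the transcript in the question. The variable names `sample` and `filtered` are only for illustration, and the bracket expression is written as `[<>*]` because `*` needs no escaping inside brackets.

```shell
# A few interleaved lines copied from the question's transcript.
sample='> HEAD / HTTP/1.1
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Server: Apache
Server: Apache
* Closing connection 0'

# Keep only the lines that start with *, < or > (the verbose stream);
# the unprefixed duplicates from -I are dropped.
filtered=$(printf '%s\n' "$sample" | grep '^[<>*]')
printf '%s\n' "$filtered"
```

Only the four prefixed lines survive; the unprefixed `HTTP/1.1 301 Moved Permanently` and `Server: Apache` lines are filtered out.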

Upvotes: 1

hanshenrik

Reputation: 21463

Best guess: with -v you tell curl to be verbose and send its debug info to stderr, and with -I you tell curl to dump the headers to stdout. Both streams are attached to your terminal by default, so they show up interleaved. Separate stdout and stderr and you'll avoid the confusion.

curl -ILv http://www.mail.com >stdout.log 2>stderr.log ; cat stdout.log
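
The same separation can be tried offline. The sketch below uses a hypothetical stand-in function, `fake_curl`, that is not part of curl; it only assumes the behavior described above, with the plain header lines going to stdout and the `<`-prefixed verbose copies going to stderr.

```shell
# Hypothetical stand-in for `curl -ILv` (an assumption for illustration):
# plain header lines go to stdout, "<"-prefixed copies go to stderr.
fake_curl() {
    printf '< HTTP/1.1 200 OK\n' >&2
    printf 'HTTP/1.1 200 OK\n'
    printf '< Server: Apache\n' >&2
    printf 'Server: Apache\n'
}

# Capture stdout only and discard the verbose stream.
stdout_only=$(fake_curl 2>/dev/null)

# Capture stderr only: point stderr at the capture first,
# then send stdout to /dev/null.
stderr_only=$(fake_curl 2>&1 >/dev/null)

printf '%s\n' "--- stdout ---" "$stdout_only" "--- stderr ---" "$stderr_only"
```

With a real curl invocation, replacing `fake_curl` with `curl -ILv http://www.mail.com` should split the output the same way, assuming curl writes its verbose output to stderr as both answers describe.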

Upvotes: 3
