kdizzle

Reputation: 627

Formatting HTTP headers using regular expressions

I want to format my HTTP headers using regex. I've done it before using split(' ') followed by array manipulation, but this time I want to do it with a regular expression.

I want to take this input, which is one large string:

GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1
Host: api.spotify.com
Cache-Control: no-cache
Postman-Token: e2f09f98-f8e0-43f7-5f0e-b16e670399e2

and turn it into an object like this:

{ headers: 
   { Host: ' api.spotify.com',
     'Cache-Control': ' no-cache',
     'Postman-Token': ' e2f09f98-f8e0-43f7-5f0e-b16e670399e2' 
   },
  verb: 'GET',
  path: '/v1/search?q=bob%20dylan&type=artist',
  protocol: 'HTTP/1.1' 
}

I understand that using the split method makes my code more readable. However, my first attempt was to use regex, since my goal was to extract and format parts of a string.

I know it is possible through regex, but is it even worth it? What does everyone think?

Thank you for your time.

Upvotes: 3

Views: 12227

Answers (3)

Maciej Kozieja

Reputation: 1865

This should work for you:

const data = `GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1
Host: api.spotify.com
Cache-Control: no-cache
Postman-Token: e2f09f98-f8e0-43f7-5f0e-b16e670399e2`

const format = data => {
    const headers = {}
    const result = { headers }
    const regex = /([\w-]+): (.*)/g
    let temp
    while (temp = regex.exec(data)) {
        headers[temp[1]] = temp[2]
    }
    temp = data.match(/(\w+)\s+(.*?)\s+(.*)/)
    result.verb = temp[1]
    result.path = temp[2]
    result.protocol = temp[3]
    return result
}

console.log(format(data))

The regex /([\w-]+): (.*)/g matches every line of the form header-name: value and captures it as ['header-name: value', 'header-name', 'value'].

Each match is then assigned to the headers object, with header-name as the key and value as the value.

Finally, the first line is parsed to get the rest of the information.

How it works

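(\w+) matches and captures one or more word characters (the verb)
\s+ matches one or more whitespace characters
(.*?) matches and captures any characters, non-greedily (*?), so it stops at the first whitespace that lets the rest of the pattern match (the path)
\s+ matches one or more whitespace characters
(.*) matches and captures everything up to the end of the line (the protocol)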
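
To try the first-line pattern on its own, here is a minimal sketch. The helper name parseRequestLine and the null check are additions for illustration and are not part of the code above; the regex itself is unchanged.

```javascript
// Hypothetical helper: applies the request-line regex explained above
// and returns null instead of throwing when the line does not match.
const parseRequestLine = line => {
    const match = line.match(/(\w+)\s+(.*?)\s+(.*)/)
    if (match === null) return null
    // match[0] is the whole match; the three groups follow it.
    const [, verb, path, protocol] = match
    return { verb, path, protocol }
}

console.log(parseRequestLine('GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1'))
```

A string with no whitespace at all, such as 'nonsense', cannot satisfy \s+ and so returns null here, whereas reading temp[1] from a failed match would throw.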

Upvotes: 5

guest271314

Reputation: 1

You can use .split() with the RegExp /\s/. The first three elements of the returned array are verb, path, and protocol; remove each of them with .shift(). The remaining elements are then consumed in pairs, again with .shift(), and set as property/value pairs on the headers object until the array's .length is 0, which ends the while loop.

let getHeaders = headers => {

  let h = headers.split(/\s/);

  let o = {
    verb: h.shift(),
    path: h.shift(),
    protocol: h.shift(),
    headers: {}
  };

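  while (h.length) {
    // After the split each header name still ends with ":", so strip it.
    // This pairing assumes header values contain no whitespace.
    o.headers[h.shift().replace(/:$/, '')] = h.shift();
  }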
  
  return o
};

var str = `GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1
Host: api.spotify.com
Cache-Control: no-cache
Postman-Token: e2f09f98-f8e0-43f7-5f0e-b16e670399e2`;

console.log(getHeaders(str));
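
One caveat with splitting on every whitespace character is that a header value containing spaces (a typical User-Agent, for example) gets spread across several array elements. A possible variation, sketched under the assumption that each header sits on its own line, splits by line first and then on the first colon. The name parseRequest and the sample User-Agent header are hypothetical and used only for illustration.

```javascript
// Hypothetical variation: split into lines, then split each header
// on its first colon so that values may contain spaces or colons.
const parseRequest = raw => {
  const [requestLine, ...headerLines] = raw.split(/\r?\n/);
  const [verb, path, protocol] = requestLine.split(/\s+/);

  const headers = {};
  for (const line of headerLines) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // skip blank or malformed lines
    headers[line.slice(0, idx)] = line.slice(idx + 1).trim();
  }

  return { verb, path, protocol, headers };
};

const sample = `GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1
Host: api.spotify.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64)`;

console.log(parseRequest(sample));
```

Splitting on the first colon only also keeps values such as localhost:8080 intact.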

Upvotes: 2

Eduardo Lynch Araya

Reputation: 824

This should work.

Search by:

(GET)\s(.+)\s(HTTP\/\d+\.\d+)\n(Host):\s(.+)$\n(Cache-Control):\s(.+)$\n(Postman-Token):\s(.+)$

Replace with:

{ headers:    \n\t{ $4 '$5',\n\t  '$6': '$7',\n\t  '$8': '$9'\n\t}, \n\tverb: '$1',\n\tpath: '$2',\n\tprotocol: '$3'\n}

JavaScript Code:

const regex = /(GET)\s(.+)\s(HTTP\/\d+\.\d+)\n(Host):\s(.+)$\n(Cache-Control):\s(.+)$\n(Postman-Token):\s(.+)$/gm;
const str = `GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1
Host: api.spotify.com
Cache-Control: no-cache
Postman-Token: e2f09f98-f`;
const subst = `{ headers:    \n\t{ \$4 '\$5',\n\t  '\$6': '\$7',\n\t  '\$8': '\$9'\n\t}, \n\tverb: '\$1',\n\tpath: '\$2',\n\tprotocol: '\$3'\n}`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log(result);

Input:

GET /v1/search?q=bob%20dylan&type=artist HTTP/1.1
Host: api.spotify.com
Cache-Control: no-cache
Postman-Token: e2f09f98-f

Output:

{ headers:    
    { Host 'api.spotify.com',
      'Cache-Control': 'no-cache',
      'Postman-Token': 'e2f09f98-f'
    }, 
    verb: 'GET',
    path: '/v1/search?q=bob%20dylan&type=artist',
    protocol: 'HTTP/1.1'
}

See: https://regex101.com/r/3DKEas/4
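
The pattern above spells out GET and the three header names, so it only matches requests with exactly those headers in that order. If the header set is not known in advance, one possible alternative (a sketch, not the approach used in this answer) matches the request line with named groups and collects the headers with matchAll. The function name parseWithRegex and the group names are assumptions made for illustration.

```javascript
// Hypothetical alternative: no header names are hard-coded.
const parseWithRegex = raw => {
  // Request line: method, target, and version on one line (m flag).
  const requestLine = raw.match(
    /^(?<verb>[A-Z]+)\s(?<path>\S+)\s(?<protocol>HTTP\/\d+\.\d+)$/m
  );
  if (requestLine === null) return null;

  // Every "Name: value" line becomes one property of headers.
  const headers = {};
  for (const m of raw.matchAll(/^(?<name>[\w-]+):[ \t]*(?<value>.*)$/gm)) {
    headers[m.groups.name] = m.groups.value;
  }

  const { verb, path, protocol } = requestLine.groups;
  return { headers, verb, path, protocol };
};

console.log(parseWithRegex(`POST /items HTTP/1.1
Accept: */*
Host: example.com`));
```

This returns a real object rather than a formatted string, which is usually easier to work with than the text produced by replace.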

Upvotes: 0
