Reputation: 611
I want to get the following:
Input
GET /1.1/friendships/list.json?user_id=123 HTTP/1.1
GET /1.1/friendships/list.json HTTP/1.1
GET /1.1/users/show.json?include_entities=1&user_id=321 HTTP/1.1
GET /1.1/friendships/list.json?user_id=234 HTTP/1.1
GET /1.1/friendships/create.json HTTP/1.1
Output
/1.1/friendships/list.json
/1.1/friendships/list.json
/1.1/users/show.json
/1.1/friendships/list.json
/1.1/friendships/create.json
I have been able to match up to the question mark character, but I want to match a character that is either a question mark or a blank space. Here is what I have so far:
([A-Z])+ (\S)+[\?]
Upvotes: 2
Views: 1436
Reputation: 43169
The following expression accepts GET and POST:
^(?:GET|POST)\s+([^?\s]+).*$
Broken down, this says:
^                 # start of line
(?:GET|POST)\s+   # GET or POST literally, followed by at least one whitespace character
([^?\s]+)         # group 1: one or more characters that are neither a question mark nor whitespace
.*                # 0+ chars afterwards
$                 # end of line
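The same breakdown can be kept inside the pattern itself. The following is a minimal sketch, assuming Python's re module and its re.VERBOSE flag; the names REQUEST_PATH, match and path are only illustrative and not part of the original answer.

```python
import re

# The pattern from the breakdown, written with re.VERBOSE so that
# whitespace and "#" comments inside the pattern are ignored.
REQUEST_PATH = re.compile(r"""
    ^                 # start of line
    (?:GET|POST)\s+   # GET or POST literally, then at least one whitespace
    ([^?\s]+)         # group 1: everything up to a question mark or whitespace
    .*                # rest of the line
    $                 # end of line
""", re.VERBOSE | re.MULTILINE)

match = REQUEST_PATH.match(
    "GET /1.1/users/show.json?include_entities=1&user_id=321 HTTP/1.1"
)
path = match.group(1)
print(path)
```

Because the character class excludes both the question mark and whitespace, group 1 stops at whichever of the two comes first, which covers lines with and without a query string.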
Each match then needs to be replaced by \1 (the first capture group); see a demo on regex101.com and mind the MULTILINE flag, which lets ^ and $ match at every line.
In Python, this would be:
import re
string = """
GET /1.1/friendships/list.json?user_id=123 HTTP/1.1
GET /1.1/friendships/list.json HTTP/1.1
GET /1.1/users/show.json?include_entities=1&user_id=321 HTTP/1.1
GET /1.1/friendships/list.json?user_id=234 HTTP/1.1
GET /1.1/friendships/create.json HTTP/1.1
POST /some/other/url/here
"""
rx = re.compile(r'^(?:GET|POST)\s+([^?\s]+).*$', re.M)
matches = rx.findall(string)
print(matches)
# ['/1.1/friendships/list.json', '/1.1/friendships/list.json', '/1.1/users/show.json', '/1.1/friendships/list.json', '/1.1/friendships/create.json', '/some/other/url/here']
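If you want the rewritten text rather than a list of matches, the replacement by \1 mentioned above can be done with re.sub. This is a rough sketch under the same assumptions as the snippet above; the variable names log and cleaned are made up for illustration.

```python
import re

# Two sample request lines, one with a query string and one without.
log = (
    "GET /1.1/friendships/list.json?user_id=123 HTTP/1.1\n"
    "GET /1.1/friendships/create.json HTTP/1.1\n"
)

rx = re.compile(r'^(?:GET|POST)\s+([^?\s]+).*$', re.MULTILINE)

# Replace every matching line with the contents of capture group 1.
cleaned = rx.sub(r'\1', log)
print(cleaned)
```

Since .* does not cross line breaks by default, each line is replaced on its own and the newlines between them are kept.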
Upvotes: 1