Python REGEX matching a multiline with carriage return

Question

I have the following data:

POST / HTTP/1.1
User-Agent: curl/7.27.0
Host: 127.0.0.1
Accept: */*
Content-Length: 55
Content-Type: application/x-www-form-urlencoded

id=1234&var=test&nextvar=hh%20hg&anothervar=BB55SSKKKkk

or

POST / HTTP/1.1

User-Agent: curl/7.27.0

Host: 127.0.0.1

Accept: */*

Content-Length: 55

Content-Type: application/x-www-form-urlencoded



id=1234&var=test&nextvar=hh%20hg&anothervar=BB55SSKKKkk

or

POST / HTTP/1.1^M
User-Agent: curl/7.27.0^M
Host: 127.0.0.1^M
Accept: */*^M
Content-Length: 55^M
Content-Type: application/x-www-form-urlencoded^M
^M
id=1234&var=test&nextvar=hh%20hg&anothervar=BB55SSKKKkk^M

How can I match only the id=1234&var=test&nextvar=hh%20hg&anothervar=BB55SSKKKkk string? I mean anything printable between two consecutive line endings (or ^M) and the next line ending (or ^M). I tried something like:

re.findall(r'^>([^\r\n]+)[\r\n]([a-zA-Z0-9=%&\r\n]+)', buf, re.MULTILINE|re.DOTALL)

but no match. What am I doing wrong?

Jerry · Accepted Answer

I'm not sure why you have > at the beginning of your regex. That is what prevents you from getting any matches at all, because no line in your data starts with >. If you remove it, you get a lot of matches that you do not seem to need.
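To see the effect of the leading >, here is a small sketch. It assumes the character classes in the question were [^\r\n] and [\r\n] (the escapes are not legible in the post), and it rebuilds the third sample with CRLF line endings; the variable names are only illustrative.

```python
import re

# Sample request with CRLF line endings, modelled on the question's third example.
buf = (
    "POST / HTTP/1.1\r\n"
    "User-Agent: curl/7.27.0\r\n"
    "Host: 127.0.0.1\r\n"
    "Accept: */*\r\n"
    "Content-Length: 55\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "id=1234&var=test&nextvar=hh%20hg&anothervar=BB55SSKKKkk\r\n"
)

flags = re.MULTILINE | re.DOTALL

# Attempt from the question (assumed escapes): '^>' needs a line starting
# with '>', and no line in the buffer does, so nothing can match.
with_gt = re.findall(r'^>([^\r\n]+)[\r\n]([a-zA-Z0-9=%&\r\n]+)', buf, flags)

# Same pattern without '>': it matches now, but it also picks up header lines.
without_gt = re.findall(r'^([^\r\n]+)[\r\n]([a-zA-Z0-9=%&\r\n]+)', buf, flags)

print(with_gt)          # empty list
print(len(without_gt))  # more than one (line, fragment) pair
```

So dropping the > alone is not enough: the pattern still needs something that singles out the line after the blank line.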

I would suggest:

(?



This ensures that exactly two consecutive newlines (two \r\n, \r, or \n) come before the line you're trying to match. The negative lookbehind (?<!...) is what enforces it (it fails the match if there's a newline/carriage return character before the two consecutive newlines).


The above regex doesn't really need the multiline and dotall flags, so you can drop them in this instance if you want to.
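As a rough sketch of the idea described above: the pattern below is my own reconstruction under assumptions, not necessarily the exact regex suggested here. It only handles LF and CRLF line endings (not a bare \r), and BODY_RE, HEADERS, BODY and build_request are hypothetical names used for illustration.

```python
import re

# Assumed reconstruction of the approach, piece by piece:
#   (?<![\r\n])    no line-break character directly before the two breaks
#   (?:\r?\n){2}   two consecutive line breaks, each LF or CRLF
#   ([^\r\n]+)     the line to capture, up to the next line break
BODY_RE = re.compile(r'(?<![\r\n])(?:\r?\n){2}([^\r\n]+)')

HEADERS = [
    "POST / HTTP/1.1",
    "User-Agent: curl/7.27.0",
    "Host: 127.0.0.1",
    "Accept: */*",
    "Content-Length: 55",
    "Content-Type: application/x-www-form-urlencoded",
]
BODY = "id=1234&var=test&nextvar=hh%20hg&anothervar=BB55SSKKKkk"


def build_request(eol):
    """Join the sample headers and body using the given line ending."""
    return eol.join(HEADERS) + eol * 2 + BODY + eol


# Try both the LF-only and the CRLF (^M) variants of the data.
for eol in ("\n", "\r\n"):
    print(BODY_RE.findall(build_request(eol)))

# The pattern contains no ^, $ or '.', so these flags do not change the result.
flagged = re.compile(BODY_RE.pattern, re.MULTILINE | re.DOTALL)
print(flagged.findall(build_request("\r\n")))
```

The line breaks are written as \r?\n rather than an alternation like \r\n|\r|\n because, with the alternation, the engine can backtrack and count a single \r\n as two breaks, which would make every header line match.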

regex101 demo



EDIT: Since the \r, \n and ^M are not metacharacters, I would suggest this:

(?


regex101 demo
