Reputation: 64
I am trying to check whether certain patterns exist between two other patterns across multiple lines. Namely, in a SIP SDP I would like to know whether 'a=recvonly', 'a=sendonly' or 'a=inactive' exists between two lines beginning with 'm=' or, if there is no second 'm=' line, between the 'm=' line and the end of the string ($). For example, between 'm=audio' and 'm=video' or, if no other line beginning with 'm=' exists, until the end, which is an empty line at the bottom.
v=0\r$
o=- 1402066778 5 IN IP4 10.1.1.1\r$
c=IN IP4 10.1.1.1\r$
m=audio 2066 RTP/AVP 0 101\r$
a=rtpmap:0 PCMU/8000\r$
a=rtpmap:101 telephone-event/8000\r$
a=ptime:20\r$
a=inactive\r$
m=video 0 RTP/AVP 109 34\r$
a=inactive\r$
a=rtpmap:109 H264/90000\r$
a=fmtp:109 profile-level-id=42e01f\r$
$
There is a match here!
v=0\r$
o=- 1402066778 5 IN IP4 10.1.1.1\r$
c=IN IP4 10.1.1.1\r$
m=audio 2066 RTP/AVP 0 101\r$
a=rtpmap:0 PCMU/8000\r$
a=rtpmap:101 telephone-event/8000\r$
a=ptime:20\r$
m=video 0 RTP/AVP 109 34\r$
a=inactive\r$
a=rtpmap:109 H264/90000\r$
a=fmtp:109 profile-level-id=42e01f\r$
$
There is no match here
v=0\r$
o=- 1402066778 5 IN IP4 10.1.1.1\r$
c=IN IP4 10.130.93.210\r$
m=audio 2066 RTP/AVP 0 101\r$
a=rtpmap:0 PCMU/8000\r$
a=rtpmap:101 telephone-event/8000\r$
a=ptime:20\r$
a=recvonly\r$
$
There is a match here again
I thought the following should work because '|' is not greedy, but it still finds the pattern in Example 2, where it should not, since the attribute there appears below the m=video line.
re1way = re.compile(r'm=audio.*?(a=recvonly|a=sendonly|a=inactive).*?[(^m=).*|(^$)]')
Where is the flaw in my idea please?
Upvotes: 1
Views: 179
Reputation: 161
I'm not entirely sure from your question what the exact requirements are. Given your examples and your note that the end of the string is a possible endpoint, I'll assume you want to determine whether one of the three "a=" attributes you cite appears between the first "m=" and either the next "m=" or the end of the string, within a single string object (rather than identifying every such section in that string).
In this case, I would recommend a two-step approach that uses the '|' alternation operator (the code is for explanatory purposes, but you get the idea). I'm sure you could craft a fairly complicated single-pattern search with some work, but in terms of readability I think this is easier:
import re

# Capture from the first "m=" up to the next "m=" or the end of the string.
a = re.search(r"m=(.*?)(m=|$)", example, re.DOTALL)
if a:
    ares = a.group()
    # Look for a direction attribute inside that first media section only.
    aresb = re.search(r"a=(recvonly|sendonly|inactive)", ares)
    if aresb:
        print("Yes, 'a=' substring found! Matching substring: " + aresb.group())
else:
    print("No initial 'm=' found!")
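
If you do need every media section rather than only the first one, the same two-step idea can be extended. The following is only a sketch under my own assumptions: the helper name `media_directions` is made up for illustration, and it assumes that media sections always start at a line beginning with 'm=' and that lines end in "\r\n" or "\n".

```python
import re

# One media section: from a line starting with "m=" up to (not including)
# the next such line, or the end of the string.
SECTION = re.compile(r"^m=.*?(?=^m=|\Z)", re.MULTILINE | re.DOTALL)

# A direction attribute on a line of its own (optional trailing "\r").
DIRECTION = re.compile(r"^a=(recvonly|sendonly|inactive)\r?$", re.MULTILINE)


def media_directions(sdp):
    """Map each media line to its direction attribute, or None if it has none.

    Hypothetical helper for illustration; not part of any SIP/SDP library.
    """
    result = {}
    for section in SECTION.finditer(sdp):
        text = section.group()
        media_line = text.splitlines()[0]   # e.g. "m=audio 2066 RTP/AVP 0 101"
        found = DIRECTION.search(text)
        result[media_line] = found.group(1) if found else None
    return result
```

Calling `media_directions(example)` on your second example should report None for the audio line and 'inactive' for the video line, which is the distinction you are after.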
Note that the standard re module only supports fixed-width lookbehind assertions, so trying to use a variable-length negative lookbehind in a single pattern to rule out an 'm=' line that appears before the match (e.g. Example 2) will not work. A multi-step solution is best in my opinion.
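As for the flaw in the original pattern: `[(^m=).*|(^$)]` is a character class, so it matches one single character out of that set rather than acting as an alternation, and nothing stops the lazy `.*?` after 'm=audio' from growing past the 'm=video' line until it finds an 'a=inactive'. If a single pattern is still preferred, one alternative that does not need lookbehind at all is a negative lookahead checked at every step. This is a sketch, not a drop-in answer: the names `ONE_WAY_AUDIO` and `audio_direction` are my own, and it assumes the section of interest starts at 'm=audio'.

```python
import re

ONE_WAY_AUDIO = re.compile(
    r"^m=audio"                              # the audio media line
    r"(?:(?!^m=).)*?"                        # advance, but never onto the start of another "m=" line
    r"^a=(recvonly|sendonly|inactive)\r?$",  # direction attribute on its own line
    re.MULTILINE | re.DOTALL,
)


def audio_direction(sdp):
    """Return the direction attribute of the audio section, or None.

    Hypothetical helper for illustration only.
    """
    found = ONE_WAY_AUDIO.search(sdp)
    return found.group(1) if found else None
```

Because the lookahead refuses to consume the first character of any later 'm=' line, the search cannot reach an attribute that belongs to the video section, so Example 2 yields None.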
Upvotes: 1