Muhammed K K

Reputation: 1172

Parsing multi-line comments from JS using Python

I want to get the contents of the multi-line comments in a JS file using Python.

I tried this code sample:

import re
code_m = """
/* This is a comment. */
"""
code_s = "/* This is a comment*/"

reg = re.compile(r"/\*(?P<contents>.*)\*/", re.DOTALL + re.M)
matches_m = reg.match(code_m)
matches_s = reg.match(code_s)
print matches_s # Gives a match object
print matches_m # Gives None

matches_m is None, but matches_s is a match object. What am I missing here?

Upvotes: 2

Views: 691

Answers (2)

Blender

Reputation: 298166

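re.match only tests whether the regex matches at the beginning of the string. code_m begins with a newline rather than with /*, so the match fails. You're probably looking for re.search, which scans the whole string: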

>>> reg.search(code_m)
<_sre.SRE_Match object at 0x7f293e94d648>
>>> reg.search(code_m).groups()
(' This is a comment. ',)
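
As a minimal sketch of the difference, assuming Python 3 (the question uses Python 2 print statements), the two calls can be compared side by side. The variable names match_m, match_s, and search_m are made up for this illustration.

```python
import re

# Same pattern as in the question, written as a raw string.
reg = re.compile(r"/\*(?P<contents>.*)\*/", re.DOTALL)

code_m = "\n/* This is a comment. */\n"  # begins with a newline
code_s = "/* This is a comment*/"        # begins with the comment itself

match_m = reg.match(code_m)    # fails: the string does not start with "/*"
match_s = reg.match(code_s)    # succeeds: "/*" is at position 0
search_m = reg.search(code_m)  # succeeds: search() scans forward for a match

print(match_m)                     # None
print(search_m.group("contents"))  # " This is a comment. "
```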

Upvotes: 2

Andrew Clark

Reputation: 208475

match() only matches at the start of the string; use search() instead.

Using match() is like putting an implicit beginning-of-string anchor (\A) at the start of your regex.
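
A small sketch of that equivalence, assuming Python 3: prefixing the pattern with \A and calling search() gives the same result as match() on the multi-line string. The names via_match, via_anchored_search, and via_search are illustrative only.

```python
import re

pattern = r"/\*(?P<contents>.*)\*/"
text = "\n/* This is a comment. */\n"  # the comment starts at index 1

# match() and an explicitly anchored search() both require the
# pattern to begin at index 0, so both fail here.
via_match = re.match(pattern, text, re.DOTALL)
via_anchored_search = re.search(r"\A" + pattern, text, re.DOTALL)

# An unanchored search() is free to start the match later in the string.
via_search = re.search(pattern, text, re.DOTALL)

print(via_match, via_anchored_search)  # None None
print(via_search.start())              # 1
```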

As a side note, you don't need the re.M flag unless you are using ^ or $ in your regex and want them to match at the beginning and end of each line. You should also combine multiple flags with a bitwise OR (re.S | re.M, for example) rather than adding them.
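
Both points can be checked with a short sketch, assuming Python 3. The sample text and the names with_m, without_m, added, and ored are invented for the illustration. It shows that re.M makes no difference for a pattern without ^ or $, and that addition goes wrong as soon as a flag is repeated, because the sum lands on a different flag's bit.

```python
import re

text = "\n/* first line\n   second line */\n"
pattern = r"/\*(.*)\*/"  # no ^ or $, so re.M has nothing to act on

with_m = re.search(pattern, text, re.S | re.M)
without_m = re.search(pattern, text, re.S)
same = with_m.group(1) == without_m.group(1)
print(same)  # True

# Bitwise OR is idempotent; addition is not.
ored = int(re.S | re.S)         # still the DOTALL bit
added = int(re.S) + int(re.S)   # a different bit: the value of re.U
print(ored == int(re.S))        # True
print(added == int(re.S))       # False
```

With distinct flags, + and | happen to give the same number, which is why the code in the question still works; | stays correct even if the same flag ends up in the expression twice.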

Upvotes: 4
