Reputation: 1
I've been trying to find a regex in Ruby that matches a PHP comment block:
/**
 * @file
 * lorum ipsum
 *
 * @author ME <me@localhost>
 * @version 00:00 00-00-0000
 */
Could anyone help? I've searched a lot, and although some of the regexes I found work in a regex tester, they don't work when I use them in my Ruby file.
This is the most successful regex I have found:
(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)
This is the output from my script:
file is ./test/123.rb so regex is ((^\s*#\s)+(.*?))+
i = 0
found: my first ruby comment
file is ./test/abc.php so regex is (/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)
i = 0
found: *
i = 1
found: *
Here is the code I have to do this:
def self.extract_comments f
  if @regex[File.extname(f)]
    puts "file is " + f + " so regex is " + @regex[File.extname(f)]
    cur_rgx = Regexp.new @regex[File.extname(f)]
    matches = IO.read( f ).scan( cur_rgx )
    content = ""
    if ! matches.empty?
      # content = "== " + f + " ==\n"
      content += f + "\n"
      for i in 0...f.length
        content += "="
      end
      content += "\n"
      for i in 0...matches.length
        puts "i = " + i.to_s
        puts "found: " + matches[i][2].to_s
        content << matches[i][2].to_s + "\n"
      end
      content << "\n"
    end
  end
  content || '' # return something
end
Upvotes: 0
Views: 165
Reputation: 1570
Unless it is important that each line inside the comment block begins with an asterisk, you may want to try this regex:
/\/\*(?:[^*]+|\*+(?!\/))*\*\//
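As a rough sketch of how this pattern behaves with String#scan: when a regex contains capturing groups, scan returns an array of group arrays, so matches[i][2] in the question picks out the third group rather than the whole comment, which is probably why only a fragment was printed. The pattern above uses only non-capturing groups, so scan should hand back each whole comment as a string. The sample text and the constant names below are made up for illustration.

```ruby
# The pattern from this answer: it has only non-capturing groups,
# so String#scan returns whole matches as strings.
BLOCK_COMMENT = /\/\*(?:[^*]+|\*+(?!\/))*\*\//

# Hypothetical sample input standing in for the contents of abc.php.
SAMPLE_PHP = <<~PHP
  <?php
  /**
   * @file
   * lorum ipsum
   */
  function foo() {}
  /* second comment */
PHP

# Each element is one complete comment, so no [2] index is needed.
COMMENTS = SAMPLE_PHP.scan(BLOCK_COMMENT)

COMMENTS.each_with_index do |comment, i|
  puts "i = #{i}"
  puts "found: #{comment}"
end
```

If you keep a pattern that does have capturing groups, wrapping the whole thing in one group and reading index 0 of each scan result is another option.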
EDIT: And here's a stricter version, which will only match comments that are formatted exactly like your example:
/^( *)\/\*\*\n(?:\1 \*(?:[^*\n]|\*(?!\/))*\n)+\1 \*\//
This version will only match a comment that has /** and */ on separate lines. /** can be indented by an arbitrary number of spaces (but no other white-space characters), but the other lines must be indented by exactly one more space than the /** line.
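Here is a small sketch of the strict version in action, assuming two made-up inputs: one block whose asterisks line up one space to the right of the /** line, and one whose asterisks sit in the same column. Because this pattern captures the indentation in group 1, scan would return the indentation rather than the comment, so the sketch uses String#[] to get the whole match.

```ruby
# The stricter pattern from this answer (group 1 captures the indentation).
STRICT_COMMENT = /^( *)\/\*\*\n(?:\1 \*(?:[^*\n]|\*(?!\/))*\n)+\1 \*\//

# Hypothetical inputs: a well-aligned doc block and a misaligned one.
ALIGNED    = "  /**\n   * @file\n   * lorum ipsum\n   */\n"
MISALIGNED = "/**\n* @file\n*/\n"

# String#[] with a regex returns the whole match, or nil when nothing matches.
ALIGNED_MATCH    = ALIGNED[STRICT_COMMENT]
MISALIGNED_MATCH = MISALIGNED[STRICT_COMMENT]

puts ALIGNED_MATCH.inspect
puts MISALIGNED_MATCH.inspect
```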
EDIT 2: Here is another version:
/^([ \t]*)\/\*\*.*?\n(?:^\1 .*?\n)+^\1 \*\//
It allows a mixture of tabs and spaces (ew) for indentation, but still requires all lines to conform to the indentation of the /** one (plus a single space).
Upvotes: 0
Reputation: 55002
It seems like /\/\*.*?\*\//m should do.
Also, that's really a C-style comment block.
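Since the question builds its regexes from strings with Regexp.new, here is a minimal sketch of how the /m flag could be passed in that setup. In Ruby, "." does not match a newline unless the regex is multiline, and Regexp::MULTILINE is the constant form of /m. The COMMENT_PATTERNS hash and the extract_block_comments helper are hypothetical stand-ins for the question's @regex table and method.

```ruby
# Hypothetical table mirroring the question's @regex hash:
# file extension => pattern string (no capturing groups).
COMMENT_PATTERNS = { ".php" => '/\*.*?\*/' }

# Build the regex from its string form. Regexp::MULTILINE is the /m flag,
# which lets "." match newlines so the pattern can span several lines.
def extract_block_comments(source, ext)
  pattern = COMMENT_PATTERNS[ext]
  return [] unless pattern
  source.scan(Regexp.new(pattern, Regexp::MULTILINE))
end

# Made-up sample text containing one multi-line and one single-line comment.
SOURCE = "<?php\n/**\n * @file\n */\necho 1; /* inline */\n"

puts extract_block_comments(SOURCE, ".php")
```

Without the multiline option, the same pattern would only find comments that start and end on a single line.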
Upvotes: 1