tunafish24

Reputation: 2469

Regular Expression to Extract all Function Declarations

I'm not good with regular expressions, so I need help writing one that will extract all C function declarations from a Word document. I've already loaded the document into memory and read its text, so that part is not an issue. All of the functions start with INTERNAL_ and, obviously, end with ); e.g.

INTERNAL_DisplayMessage ( param a, int b );

So basically, I need a regular expression that will extract the entire function declaration, from INTERNAL_ through the closing semicolon. The return type is the same across all APIs, so it's irrelevant.

Upvotes: 0

Views: 1120

Answers (2)

JotaBe

Reputation: 39004

You need to use this regex:

  (INTERNAL_[^ ]+?\s?\(.*?\);)

The outer parentheses capture the entire text of each function declaration in a group.

The parentheses of the function declaration are escaped with a backslash, \( and \), so that they are treated as literal characters instead of as a grouping.

[^ ]+?\s? means any character that is not a space, one or more times, followed by an optional whitespace character just before the opening parenthesis.

.*? means any character (.), any number of times including zero (*), as few times as possible (?).
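To see the pieces described above working together, here is a minimal sketch in Python (chosen only because it is easy to run; the pattern itself is unchanged). The sample text and the variable names are made up for illustration and stand in for the text read from the Word document.

```python
import re

# Pattern from this answer; the outer parentheses form capture group 1.
PATTERN = re.compile(r"(INTERNAL_[^ ]+?\s?\(.*?\);)")

# Hypothetical sample text standing in for the document contents.
sample = (
    "The API exposes INTERNAL_DisplayMessage ( param a, int b ); and "
    "INTERNAL_Reset(); as documented."
)

# Collect group 1 of every match, in order of appearance.
declarations = [m.group(1) for m in PATTERN.finditer(sample)]

for declaration in declarations:
    print(declaration)
```

Because both quantifiers are lazy, each match should stop at the first ); it meets rather than running on to the last one in the text.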

Because your function declarations contain newlines (\n), you need to create the regex with the RegexOptions.Singleline option as the second argument of the Regex constructor. The documentation describes it as follows:

Specifies single-line mode. Changes the meaning of the dot (.) so it matches every character (instead of every character except \n).

See doc at: RegexOptions Enumeration
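The effect of single-line mode can be checked with a small sketch. This one is in Python, where re.DOTALL plays the same role as RegexOptions.Singleline in .NET; the multi-line declaration below is a made-up example, not taken from the question.

```python
import re

RAW_PATTERN = r"(INTERNAL_[^ ]+?\s?\(.*?\);)"

# Hypothetical declaration whose parameter list spans several lines.
multiline = "INTERNAL_DisplayMessage (\n    param a,\n    int b );"

# Default mode: the dot does not match "\n", so the lazy .*? can never
# get from the opening parenthesis to the closing ");".
without_flag = re.findall(RAW_PATTERN, multiline)

# re.DOTALL (the Python counterpart of RegexOptions.Singleline):
# the dot also matches "\n", so the whole declaration is captured.
with_flag = re.findall(RAW_PATTERN, multiline, flags=re.DOTALL)

print(len(without_flag), len(with_flag))
```

If every declaration in the document sits on one line, the flag makes no difference, so it should be safe to leave it on either way.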

A good place to check regexes is this one:

www.regexplanet.com

It lets you change the language and set options. For the Singleline option, tick the 'dot (.) matches every character instead of every character except newlines (Singleline)' option on that page.

Upvotes: 2

Craig W

Reputation: 4540

Something as simple as (INTERNAL_.+?\);) should work. I highly recommend RegExr for these types of tasks.
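One thing worth knowing before relying on the simpler pattern: it does not require an opening parenthesis after the name, so a bare mention of the prefix in the surrounding prose can get swallowed into a match. A rough Python sketch comparing it with the stricter pattern from the other answer (the sample sentence is invented for illustration):

```python
import re

SIMPLE = r"(INTERNAL_.+?\);)"
STRICT = r"(INTERNAL_[^ ]+?\s?\(.*?\);)"

# Hypothetical prose that mentions the prefix before a real declaration.
text = "The INTERNAL_ prefix is used. Call INTERNAL_Reset();"

# The simple pattern starts at the first "INTERNAL_" it sees and runs
# to the first ");", so the prose in between is included.
simple_matches = re.findall(SIMPLE, text)

# The strict pattern needs at least one non-space character after the
# prefix and then "(", so the bare mention is skipped.
strict_matches = re.findall(STRICT, text)

print(simple_matches)
print(strict_matches)
```

If the document only ever uses INTERNAL_ as part of a declaration, both patterns should give the same results and the shorter one is fine.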

Upvotes: 2
