Zombie_Pigdragon

Reputation: 354

How to Match a Comma-Separated List and End with a Different Character

One project I am currently working on involves writing a parser in C#.

I chose to use Regex to extract the parts of each line. Only one problem... I have very little Regex experience.

My current issue is that I can't get argument lists to work. More specifically, I can't match comma-separated lists. After two hours of being stuck, I've turned to SO.

My closest regex so far is:

(?:\s|^)(bool|int|string|float|void)\s+(\w+)\s*\(((?:bool|int|string|float)\s+\w+\s*)*\)

The function body is intentionally not matched, and only the listed types need to be supported.

I removed any and all comma detection code, as it all broke.

I want to make it match void FunctionName(int a, string b) or the equivalent with other spacing.


How can I make this happen?

Please suggest edits before voting to close, I'm bad at Stack Overflowing.

Upvotes: 0

Views: 74

Answers (2)

wp78de

Reputation: 18950

Try it like this:

(?:\s|^)(bool|int|string|float|void)\s+(\w+)\s*\(((?:bool|int|string|float)\s+\w+(?(?=\s*,\s*\w)\s*,\s*|\s*))*\)

Explanation:

  • The crucial part here is the if-else construct (?(?=regex)then|else): (?(?=\s*,\s*\w)\s*,\s*|\s*)
    which means: if a type-parameter pair is followed by a comma and another word character, consume the comma and its surrounding whitespace; otherwise, consume only optional whitespace.
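
To see how the conditional behaves, here is a minimal C# sketch that runs the pattern above against a few sample lines. The class name, the sample inputs, and the trimming of each capture are assumptions made for illustration; only the pattern comes from this answer. It relies on .NET keeping every iteration of a repeated group in Group.Captures.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConditionalListDemo
{
    // Pattern copied from the answer; the (?(?=...)then|else) conditional
    // is supported by the .NET regex engine.
    public const string Pattern =
        @"(?:\s|^)(bool|int|string|float|void)\s+(\w+)\s*\(((?:bool|int|string|float)\s+\w+(?(?=\s*,\s*\w)\s*,\s*|\s*))*\)";

    public static void Main()
    {
        // Hypothetical sample lines, not taken from the question.
        string[] samples =
        {
            "void FunctionName(int a, string b)",
            "int Add(int x ,int y )",
            "void Broken(int a, )",   // trailing comma
        };

        foreach (string line in samples)
        {
            Match m = Regex.Match(line, Pattern);
            Console.WriteLine($"{line} -> {(m.Success ? "match" : "no match")}");
            if (!m.Success) continue;

            Console.WriteLine($"  return type: {m.Groups[1].Value}, name: {m.Groups[2].Value}");

            // Group 3 repeats, so each parameter is a separate Capture.
            // Each capture still carries its trailing comma/whitespace.
            foreach (Capture c in m.Groups[3].Captures)
                Console.WriteLine($"  parameter: {c.Value.TrimEnd(',', ' ', '\t')}");
        }
    }
}
```

With these inputs the first two lines should match and the one with the trailing comma should not. Because the else branch is just \s*, the pattern still appears to accept a list with a missing comma, such as (int a string b), so it may need tightening if that matters.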

However, I feel that regex could turn out to be the wrong choice for the task at hand. There are some lightweight parser frameworks out there, e.g. Sprache.

Upvotes: 1

Poul Bak

Reputation: 10929

You're actually very close:

(?:\s|^)(bool|int|string|float|void)\s+(\w+)\s*\(((?:bool|int|string|float)\s+\w+,?\s*)*\)

The only difference is the ,? near the end of the regex, which means an optional comma and matches the comma between parameters.
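
As a quick check of what the optional comma allows, the C# sketch below compares the pattern above with a stricter variant. The stricter pattern is not from this answer: it is a hypothetical alternative that requires a comma between parameters and forbids one after the last parameter. The class name and sample lines are likewise assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalCommaDemo
{
    // Pattern copied from the answer: ",?" makes every comma optional.
    public const string Lenient =
        @"(?:\s|^)(bool|int|string|float|void)\s+(\w+)\s*\(((?:bool|int|string|float)\s+\w+,?\s*)*\)";

    // Hypothetical building block: one "type name" pair.
    private const string Param = @"(?:bool|int|string|float)\s+\w+";

    // Hypothetical stricter variant (an assumption, not the answer's regex):
    // an optional first parameter followed by zero or more ", parameter" groups.
    public static readonly string Strict =
        @"(?:\s|^)(bool|int|string|float|void)\s+(\w+)\s*\(\s*(?:"
        + Param + @"(?:\s*,\s*" + Param + @")*)?\s*\)";

    public static void Main()
    {
        // Hypothetical sample lines, not taken from the question.
        string[] samples =
        {
            "void FunctionName(int a, string b)",
            "void NoArgs()",
            "void TrailingComma(int a,)",
            "void MissingComma(int a string b)",
        };

        foreach (string s in samples)
        {
            bool lenient = Regex.IsMatch(s, Lenient);
            bool strict = Regex.IsMatch(s, Strict);
            Console.WriteLine($"{s,-36} lenient={lenient} strict={strict}");
        }
    }
}
```

Both patterns accept the first two lines. The pattern with ,? also accepts the trailing-comma and missing-comma lines, while the stricter sketch rejects them, which may or may not matter depending on how malformed the input can be.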

Upvotes: 0
