Cobra_Fast

Reputation: 16111

Regex for Configuration files

I am planning a simple configuration file layout like

# commentary line
setting1 = some string
setting2 = 123
setting3=whatever

Now I want to write a regular expression (in C#/.NET) that can read that config file. My attempt is

!(\#)(.*)\s=\s(.*)

My goal is to

Is that correct, or am I doing it wrong? If it's wrong, how should it be done?

Upvotes: 0

Views: 5438

Answers (4)

royas

Reputation: 4936

Try something like this:

^\s*(?:([^#\s=]+)\s*=\s*([^#]+)(#.*)?)|(#.*)$

If it doesn't match, the line is invalid, i.e. there is an error in the config file. The 1st and 2nd matching groups are the key and the value (trim the whitespace at the end of the value), the 3rd is an end-of-line comment, and the 4th is a whole-line comment. What about blank lines?

Upvotes: 0

Eugene

Reputation: 11280

Try this:

^([a-zA-Z_]\w*)\s*=\s*([^#]+?)$


This works for names that consist of letters, digits and _ (and can't start with a digit). The 'm' (multiline) flag is required so that ^ and $ match at the start and end of each line.

P.S. I tried this in an online JS regex evaluator, but I think regexes are the same in C# (you may need to make some changes).

P.P.S. I see that in C# you can name the match groups in a regular expression. I found some C# code that does something similar to what you need, I think. (It uses a different regex, with other name rules and without # comment support; you should replace it with your own.)

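    // requires: using System.Text.RegularExpressions;
    Regex re = new Regex(@"^(?<key>[a-zA-Z_]\w*)\s*=\s*(?<value>\w+)$",
        RegexOptions.IgnoreCase);

    // Match returns an unsuccessful Match instead of throwing when the line doesn't fit
    Match match = re.Match(str);
    if (match.Success)
    {
        // get pair from line
        string option_name = match.Groups["key"].Value;
        string option_value = match.Groups["value"].Value;
    }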
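
For reference, here is a minimal sketch of how the first pattern in this answer might be applied to a whole file's text in C#. `RegexOptions.Multiline` plays the role of the 'm' flag. The class and method names are hypothetical, and normalizing `\r\n` is an assumption made because .NET's `$` only matches before `\n`. As written, the pattern skips lines that carry a trailing # comment.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MultilineConfigRegex
{
    // First pattern from the answer; Multiline makes ^ and $ work per line.
    private static readonly Regex Setting =
        new Regex(@"^([a-zA-Z_]\w*)\s*=\s*([^#]+?)$", RegexOptions.Multiline);

    // Hypothetical helper: collects every name/value pair found in the text.
    public static Dictionary<string, string> Parse(string text)
    {
        // Assumption: normalize Windows line endings so '\r' never ends up in a value.
        text = text.Replace("\r\n", "\n");

        var settings = new Dictionary<string, string>();
        foreach (Match m in Setting.Matches(text))
        {
            // Group 1 is the name, group 2 the value (trimmed of trailing spaces).
            settings[m.Groups[1].Value] = m.Groups[2].Value.Trim();
        }
        return settings;
    }
}
```

Calling `MultilineConfigRegex.Parse` on the sample file from the question should yield the three settings and ignore the comment line.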

Upvotes: 0

Joe White

Reputation: 97858

If you insist on using a regular expression, this one should do the job:

^([^#][^\s=]+(?:\s+[^\s=]+)*)\s*=\s*(.*)

It won't match any line that starts with #. If it does match, the first matching group will be the name (without trailing spaces -- that's what the nested, non-capturing group is for; I think I correctly optimized it to prevent backtracking). The second matching group will be the value, without leading spaces.

I believe this should fit your criterion that names "basically can be anything"; it should match anything except trailing spaces and =. And it should match any value, including values that contain =. But I haven't tested this to confirm the edge cases, so make sure to write lots of unit tests to make sure it works correctly with a variety of inputs.

And, of course, be aware that this will be both slower, and more complicated, than just doing the string parsing directly.
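
A few of the edge cases mentioned above can be checked with a sketch like the following. It assumes the nested non-capturing group is meant to repeat zero or more times (a trailing `*`), so that single-word names match; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConfigLineRegex
{
    // Pattern from the answer, with the nested group assumed to be repeatable (*).
    private static readonly Regex Line =
        new Regex(@"^([^#][^\s=]+(?:\s+[^\s=]+)*)\s*=\s*(.*)");

    // Hypothetical helper: true plus the name/value pair if the line is a setting.
    public static bool TryParse(string line, out string name, out string value)
    {
        Match m = Line.Match(line);
        name = m.Success ? m.Groups[1].Value : string.Empty;
        value = m.Success ? m.Groups[2].Value : string.Empty;
        return m.Success;
    }
}
```

Each line is passed in on its own (without the line terminator), for example from a loop over the file's lines. Note that the pattern needs at least two characters in the name.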

Upvotes: 0

unholysampler

Reputation: 17341

Name-value pairs are not that complicated. There is no need to bring regular expressions into this. All you need is a foreach loop over the lines in the file (which you have anyway), then a simple if statement checking that the line doesn't start with a comment indicator, followed by splitting the string on the equals sign. Regular expressions are cool, but sometimes they make things more complicated. Now you have two problems.
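
The loop described above could look roughly like this in C#. It is only a sketch: the class and method names are hypothetical, and skipping blank or malformed lines and splitting on the first = only (so that values may contain =) are assumptions, not requirements stated in the question.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleConfig
{
    // Hypothetical helper: parses "name = value" lines, skipping blanks and # comments.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var settings = new Dictionary<string, string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0 || line[0] == '#')
                continue; // blank line or comment

            // Split on the first '=' only, so the value may itself contain '='.
            int eq = line.IndexOf('=');
            if (eq <= 0)
                continue; // no '=' or empty name: assumed ignorable in this sketch

            string name = line.Substring(0, eq).Trim();
            string value = line.Substring(eq + 1).Trim();
            settings[name] = value; // a later duplicate overwrites an earlier one
        }
        return settings;
    }
}
```

The lines can come from `System.IO.File.ReadLines(path)`, which already strips the line terminators.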

Upvotes: 2
