user646265

Reputation: 1031

Regex Help (again)

I'm not sure what to title this, but I need some help with regular expressions. First, to be clear: I'm not trying to match HTML or XML, even though it may look like it. The lines below are part of a file format for a program I wrote; the file specifies which details that program should export. There is no hierarchy involved: each line simply contains one 'tag':

<n>

My program matches this against an enumeration, which tells it to export the name value. I also have tags like this:

<adr:home>

This specifies the home address. I use the following regex:

<((?'TAG'.*):(?'SUBTAG'.*)?)?(\s+((\w+)=('|"")?(?'VALUE'.*[^'])('|"")?)?)?>

The problem is that the regex splits the adr:home tag fine but fails to match the n tag, because that tag has no colon. When I add a ? or a *, it no longer splits adr:home and similar tags. Can anyone help? I'm sure it's something simple; this is just my first time writing a regular expression. I'm working in C#, by the way.

Upvotes: 1

Views: 78

Answers (3)

TheCodeKing

Reputation: 19220

I'm not entirely sure what your aim is, but try this:

(?><)(?'TAG'[^:\s>]*)(:(?'SUBTAG'[^\s>:]*))?(\s\w+=['"](?'VALUE'[^'"]*)['"])?(?>>)

I find this site extremely useful for testing C# regular expressions.
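As a rough sketch of how the pattern above might be called from C#, assuming each line of the file holds exactly one tag: the `TagParser` class and its `Describe` method are hypothetical names made up for illustration, not part of the asker's program. The pattern is written as a verbatim string, where a double quote is escaped by doubling it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagParser
{
    // The pattern from this answer as a C# verbatim string
    // (inside @"..." a literal double quote is written as "").
    private static readonly Regex TagPattern = new Regex(
        @"(?><)(?'TAG'[^:\s>]*)(:(?'SUBTAG'[^\s>:]*))?(\s\w+=['""](?'VALUE'[^'""]*)['""])?(?>>)");

    // Hypothetical helper: returns "tag|subtag|value", with an empty string
    // for each part that is absent, or null when the line is not a tag.
    public static string Describe(string line)
    {
        Match m = TagPattern.Match(line);
        if (!m.Success) return null;
        // A named group that did not take part in the match has an empty Value.
        return m.Groups["TAG"].Value + "|" + m.Groups["SUBTAG"].Value + "|" + m.Groups["VALUE"].Value;
    }

    public static void Main()
    {
        foreach (string line in new[] { "<n>", "<adr:home>", "<adr:home type='work'>" })
            Console.WriteLine(line + " -> " + Describe(line));
    }
}
```

With these three sample lines, `Describe` should return `n||`, `adr|home|` and `adr|home|work`. The `type='work'` attribute is an invented example of the optional name=value part, since the question does not show one.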

Upvotes: 1

Bob Vale

Reputation: 18474

Will this help?

<((?'TAG'.*?)(?::(?'SUBTAG'.*))?)?(\s+((\w+)=('|"")?(?'VALUE'.*[^'])('|"")?)?)?> 

I've wrapped the colon and the SUBTAG capture in an optional non-capturing group, and made the TAG capture non-greedy.
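To illustrate why the non-greedy TAG matters, here is a small sketch that compares two trimmed-down patterns. These are simplifications assumed for the example (the attribute part is left out), and `GreedyVsLazy` and `Split` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Trimmed-down patterns for illustration only: the attribute part is omitted.
    public const string Greedy = @"<(?'TAG'.*)(?::(?'SUBTAG'.*))?>";
    public const string Lazy   = @"<(?'TAG'.*?)(?::(?'SUBTAG'.*))?>";

    // Hypothetical helper: returns "tag|subtag" for the first match, or null.
    public static string Split(string pattern, string line)
    {
        Match m = Regex.Match(line, pattern);
        return m.Success ? m.Groups["TAG"].Value + "|" + m.Groups["SUBTAG"].Value : null;
    }

    public static void Main()
    {
        // Greedy TAG swallows the colon, so the optional subtag group is skipped.
        Console.WriteLine(Split(Greedy, "<adr:home>")); // adr:home|
        // Lazy TAG stops as soon as the rest of the pattern can match.
        Console.WriteLine(Split(Lazy, "<adr:home>"));   // adr|home
        Console.WriteLine(Split(Lazy, "<n>"));          // n|
    }
}
```

With a greedy `.*`, TAG first takes everything up to the end of the line and only backtracks far enough to let the final `>` match, so the optional subtag group never gets a chance at the colon. The lazy `.*?` grows one character at a time and hands over to the subtag group at the first colon.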

Upvotes: 1

SJuan76

Reputation: 24885

What if you put the colon as part of the second tag?

<((?'TAG'.*?)(?'SUBTAG':.*)?)?(\s+((\w+)=('|"")?(?'VALUE'.*[^'])('|"")?)?)?>

Upvotes: 0
