fweigl

Reputation: 22008

Match expressions in Strings

I have a database with rules that I need to apply to a bunch of strings. The rules are expressions that can occur within the strings, and they are written like this:

(word1 AND word2) OR (word3) 

I can't hardcode those (because they may be changed in the database), so I thought about programmatically turning those expressions into Regex patterns.

Has anybody done something like this already, or does anyone have an idea of the best way to do it? I'm not quite sure how to deal with more complex expressions, how to take them apart, and so on.

Edit: I'm using C# in VisualStudio / .NET.

The data is basically directory paths: a customer wants to get their documents organized. So the strings I have are paths, and the expressions in the DB could look like:

(office OR headquarter) AND (official OR confidential)

So if the file's directory path contains office and confidential, it should match.

Hope this makes it clearer.

EDIT2:

Here are some dummy examples:

The paths could look like:

c:\documents\official\johnmeyer\court\out\letter.doc
c:\documents\internal\appointments\court\in\september.doc
c:\documents\official\stevemiller\meeting\in\letter.doc

And the expressions like:

(meyer OR miller) AND (court OR jail)

So this expression would match the 1st path/file, but not the 2nd or 3rd.

Upvotes: 0

Views: 114

Answers (1)

eFloh

Reputation: 2158

No answer, but a good hint:

The expressions you have are actually trees, with the structure given by the parentheses. You need a stack machine to parse the text into a (binary) tree structure, where each inner node is an AND or OR element and the leaves are the words. Afterwards, you can construct your regex in whatever language you need by walking the tree depth-first and emitting prefix and suffix text before and after each subtree.
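A minimal sketch of the parsing step, under some assumptions of mine: it uses recursive descent (the call stack stands in for an explicit stack), it lets AND bind tighter than OR when parentheses are missing, and it treats the keywords case-insensitively. The names Node, ExpressionParser and ParserDemo are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical node type: Op is "AND"/"OR" for inner nodes, null for leaves.
public sealed class Node
{
    public string Op, Word;
    public Node Left, Right;

    // Fully parenthesized form, handy for checking the parse.
    public override string ToString() =>
        Op == null ? Word : "(" + Left + " " + Op + " " + Right + ")";
}

public sealed class ExpressionParser
{
    private readonly List<string> tokens = new List<string>();
    private int pos;

    public static Node Parse(string text)
    {
        var p = new ExpressionParser();
        // A token is "(", ")" or a run of characters without spaces/parentheses.
        foreach (Match m in Regex.Matches(text, @"[()]|[^\s()]+"))
            p.tokens.Add(m.Value);
        Node root = p.ParseOr();
        if (p.pos != p.tokens.Count)
            throw new FormatException("Unexpected token: " + p.tokens[p.pos]);
        return root;
    }

    private string Peek() => pos < tokens.Count ? tokens[pos] : null;

    private bool IsKeyword(string kw) =>
        string.Equals(Peek(), kw, StringComparison.OrdinalIgnoreCase);

    // or := and ("OR" and)*
    private Node ParseOr()
    {
        Node left = ParseAnd();
        while (IsKeyword("OR"))
        {
            pos++;
            left = new Node { Op = "OR", Left = left, Right = ParseAnd() };
        }
        return left;
    }

    // and := atom ("AND" atom)*   (assumption: AND binds tighter than OR)
    private Node ParseAnd()
    {
        Node left = ParseAtom();
        while (IsKeyword("AND"))
        {
            pos++;
            left = new Node { Op = "AND", Left = left, Right = ParseAtom() };
        }
        return left;
    }

    // atom := "(" or ")" | word
    private Node ParseAtom()
    {
        string tok = Peek();
        if (tok == null || tok == ")")
            throw new FormatException("Expected a word or '('");
        pos++;
        if (tok != "(") return new Node { Word = tok };
        Node inner = ParseOr();
        if (Peek() != ")") throw new FormatException("Missing ')'");
        pos++;
        return inner;
    }
}

public static class ParserDemo
{
    public static void Main()
    {
        Node tree = ExpressionParser.Parse("(office OR headquarter) AND (official OR confidential)");
        // Prints: ((office OR headquarter) AND (official OR confidential))
        Console.WriteLine(tree);
    }
}
```

Redundant parentheses such as the ones around word3 in "(word1 AND word2) OR (word3)" simply disappear in the tree, since grouping is already encoded in the node structure.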

Consider an abstract class TreeNode with a method GenerateExpression(StringBuilder result). Each concrete TreeNode will be either a CombinationTreeNode (with a CombinationMode of And or Or) or a SearchTextTreeNode (with a SearchText property).

GenerateExpression(StringBuilder result) for CombinationTreeNode will look something like this:

// Emit "(left) MODE (right)": left subtree first, then the operator, then the right subtree.
result.Append("(");
leftSubTree.GenerateExpression(result);
result.Append(") " + this.CombinationMode.ToString() + " (");
rightSubTree.GenerateExpression(result);
result.Append(")");

GenerateExpression(StringBuilder result) for SearchTextTreeNode is much simpler:

result.Append(this.SearchText);

Of course, your code will emit a regular expression at each node rather than reproducing the input text, as mine does.
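One possible way to emit a regex instead, as a rough sketch: each word becomes a lookahead (?=.*word), AND is plain juxtaposition of lookaheads, and OR is an alternation. The lookahead encoding, the case-insensitive matching and the RegexDemo helper are my assumptions, not something the question prescribes; the tree is built by hand here to keep the example short.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public enum CombinationMode { And, Or }

public abstract class TreeNode
{
    // Appends a zero-width pattern that succeeds if this subtree is satisfied.
    public abstract void GenerateExpression(StringBuilder result);

    public Regex ToRegex()
    {
        var sb = new StringBuilder("^");
        GenerateExpression(sb);
        // Assumption: matching should ignore case.
        return new Regex(sb.ToString(), RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}

public sealed class SearchTextTreeNode : TreeNode
{
    public string SearchText { get; }
    public SearchTextTreeNode(string searchText) { SearchText = searchText; }

    public override void GenerateExpression(StringBuilder result)
    {
        // Lookahead: the (escaped) word occurs somewhere in the input.
        result.Append("(?=.*").Append(Regex.Escape(SearchText)).Append(")");
    }
}

public sealed class CombinationTreeNode : TreeNode
{
    private readonly CombinationMode mode;
    private readonly TreeNode left, right;

    public CombinationTreeNode(CombinationMode mode, TreeNode left, TreeNode right)
    {
        this.mode = mode;
        this.left = left;
        this.right = right;
    }

    public override void GenerateExpression(StringBuilder result)
    {
        result.Append("(?:");
        left.GenerateExpression(result);
        // Lookaheads consume nothing, so juxtaposition means AND and '|' means OR.
        if (mode == CombinationMode.Or) result.Append("|");
        right.GenerateExpression(result);
        result.Append(")");
    }
}

public static class RegexDemo
{
    // Hand-built tree for: (meyer OR miller) AND (court OR jail)
    public static TreeNode BuildExample() =>
        new CombinationTreeNode(CombinationMode.And,
            new CombinationTreeNode(CombinationMode.Or,
                new SearchTextTreeNode("meyer"), new SearchTextTreeNode("miller")),
            new CombinationTreeNode(CombinationMode.Or,
                new SearchTextTreeNode("court"), new SearchTextTreeNode("jail")));

    public static void Main()
    {
        Regex rx = BuildExample().ToRegex();
        Console.WriteLine(rx.IsMatch(@"c:\documents\official\johnmeyer\court\out\letter.doc"));      // True
        Console.WriteLine(rx.IsMatch(@"c:\documents\internal\appointments\court\in\september.doc")); // False
        Console.WriteLine(rx.IsMatch(@"c:\documents\official\stevemiller\meeting\in\letter.doc"));   // False
    }
}
```

Note that this matches substrings, so "meyer" also matches "johnmeyer". If whole path segments are wanted, the leaf pattern would need word or separator boundaries. If the regex is not needed elsewhere, evaluating the tree directly with string.Contains would avoid the regex entirely.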

Upvotes: 1
