Reputation: 10132
I would like to match strings with a wildcard (*), where the wildcard means "any". For example:
*X = string must end with X
X* = string must start with X
*X* = string must contain X
Also, some compound uses such as:
*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.
Is there a simple algorithm to do this? I'm unsure about using regex (though it is a possibility).
To clarify: users will type patterns like the above into a filter box (as simple a filter as possible), and I don't want them to have to write regular expressions themselves. So a notation like the above that I can easily transform into something matchable would be good.
Upvotes: 108
Views: 173428
Reputation: 75
Keep in mind that Regex.IsMatch returns true for XYZ when matching against Y*, because an unanchored pattern can match anywhere in the string. To avoid this, I use the "^" anchor:
Regex.IsMatch(str1, "^" + str2.Replace("*", ".*?"));
So the full code to solve your problem is:
bool isMatchStr(string str1, string str2)
{
    // Either argument may contain the wildcard, so convert both
    // and test the match in both directions.
    string s1 = str1.Replace("*", ".*?");
    string s2 = str2.Replace("*", ".*?");
    bool r1 = Regex.IsMatch(s1, "^" + s2);
    bool r2 = Regex.IsMatch(s2, "^" + s1);
    return r1 || r2;
}
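To make the anchoring point concrete, here is a minimal sketch of how "^" and "$" change the result. The AnchorDemo class and its ToRegex helper are hypothetical names used only for illustration, and the sketch additionally assumes Regex.Escape is applied so that other regex metacharacters in the input are treated literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Hypothetical helper: escapes the input, turns "*" into ".*",
    // and optionally adds the "^" and "$" anchors.
    public static string ToRegex(string wildcard, bool anchorStart, bool anchorEnd)
    {
        string body = Regex.Escape(wildcard).Replace("\\*", ".*");
        return (anchorStart ? "^" : "") + body + (anchorEnd ? "$" : "");
    }

    public static void Main()
    {
        // Unanchored: "Y.*" is found in the middle of "XYZ".
        Console.WriteLine(Regex.IsMatch("XYZ", ToRegex("Y*", false, false))); // True

        // "^" forces the match to begin at the start of the string.
        Console.WriteLine(Regex.IsMatch("XYZ", ToRegex("Y*", true, false)));  // False

        // Without "$", "*Y" still matches "XYZ" because the prefix "XY" fits.
        Console.WriteLine(Regex.IsMatch("XYZ", ToRegex("*Y", true, false)));  // True

        // With both anchors the whole string has to fit the pattern.
        Console.WriteLine(Regex.IsMatch("XYZ", ToRegex("*Y", true, true)));   // False
    }
}
```

For "ends with" patterns such as *X, the "$" anchor matters as much as "^" does for "starts with" patterns.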
Upvotes: 4
Reputation: 2958
This is an improvement on the popular answer from @Dmitry Bychenko (https://stackoverflow.com/a/30300521/4491768). To support ? and * as literal characters to match, they have to be escaped in the pattern: use \\? or \\* to escape them.
Also, a precompiled regex improves performance when the pattern is reused.
public class WildcardPattern
{
    private readonly string _expression;
    private readonly Regex _regex;

    public WildcardPattern(string pattern)
    {
        if (string.IsNullOrEmpty(pattern)) throw new ArgumentNullException(nameof(pattern));
        _expression = "^" + Regex.Escape(pattern)
            .Replace("\\\\\\?","??").Replace("\\?", ".").Replace("??","\\?")
            .Replace("\\\\\\*","**").Replace("\\*", ".*").Replace("**","\\*") + "$";
        _regex = new Regex(_expression, RegexOptions.Compiled);
    }

    public bool IsMatch(string value)
    {
        return _regex.IsMatch(value);
    }
}
Usage:
new WildcardPattern("Hello *\\**\\?").IsMatch("Hello W*rld?");
new WildcardPattern(@"Hello *\**\?").IsMatch("Hello W*rld?");
Upvotes: 3
Reputation: 988
For those using .NET Core 2.1+ or .NET 5+, you can use the FileSystemName.MatchesSimpleExpression method in the System.IO.Enumeration namespace.
string text = "X is a string with ZY in the middle and at the end is P";
bool isMatch = FileSystemName.MatchesSimpleExpression("X*ZY*P", text);
Both parameters are actually ReadOnlySpan<char>, but you can pass string arguments too. You can also pass a third ignoreCase argument to turn case-insensitive matching on or off. Matching is case insensitive by default, as Chris mentioned in the comments.
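As a small illustration of the case-sensitivity switch, the sketch below calls the method with and without the third argument. The FileSystemNameDemo class name and the lower-case pattern are made up for this example.

```csharp
using System;
using System.IO.Enumeration;

public static class FileSystemNameDemo
{
    public static void Main()
    {
        string text = "X is a string with ZY in the middle and at the end is P";

        // Default behaviour: case is ignored, so a lower-case pattern still matches.
        bool ignoringCase = FileSystemName.MatchesSimpleExpression("x*zy*p", text);

        // Passing false as the third argument requests a case-sensitive comparison.
        bool caseSensitive = FileSystemName.MatchesSimpleExpression("x*zy*p", text, false);

        Console.WriteLine(ignoringCase);
        Console.WriteLine(caseSensitive);
    }
}
```

Since the method was designed for file-name matching, it is worth testing it against the exact inputs a filter box is expected to receive before relying on it.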
Upvotes: 42
Reputation: 186833
Often, wildcards operate with two types of jokers:
? - any character (one and only one)
* - any characters (zero or more)
so you can easily convert these rules into an appropriate regular expression:
// If you want to implement both "*" and "?"
private static String WildCardToRegular(String value) {
    return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$";
}
// If you want to implement "*" only
private static String WildCardToRegular(String value) {
    return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
}
And then you can use Regex as usual:
String test = "Some Data X";
Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));
// Starts with S, ends with X, contains "me" and "a" (in that order)
Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));
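The Regex.Escape call is what keeps user input such as "." or "+" from being interpreted as regex syntax. Here is a minimal, self-contained sketch of a filter built on the same conversion. The WildcardFilter class, the IsMatch wrapper and its ignoreCase parameter are assumptions added for illustration, as is the choice of RegexOptions.Singleline so that * also spans newlines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardFilter
{
    // Same conversion as above: escape everything, then re-enable "?" and "*".
    public static string WildCardToRegular(string value)
    {
        return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$";
    }

    // Hypothetical convenience wrapper; ignoreCase is an assumed extra option.
    public static bool IsMatch(string input, string wildcard, bool ignoreCase = false)
    {
        RegexOptions options = RegexOptions.Singleline;
        if (ignoreCase) options |= RegexOptions.IgnoreCase;
        return Regex.IsMatch(input, WildCardToRegular(wildcard), options);
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch("report.txt", "*.txt"));      // True
        // The "." in the pattern is literal, so "_" does not satisfy it.
        Console.WriteLine(IsMatch("report_txt", "*.txt"));      // False
        // "+" is literal, "?" stands for exactly one character.
        Console.WriteLine(IsMatch("a+b", "a+?"));                // True
        Console.WriteLine(IsMatch("some data x", "S*X", true));  // True
    }
}
```

Without Regex.Escape, the pattern *.txt would become ^.*.txt$ and would also accept report_txt.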
Upvotes: 196
Reputation: 21
For those working with C# and Excel (for example, when a worksheet name is only partially known), here is my code using a wildcard (ddd*). Briefly: the code goes through all worksheet names, and if today's abbreviated weekday (ddd) matches the first three letters of a worksheet name (the bool is true), that name is stored in a string that is used after the loop.
using System;
using Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
using Range = Microsoft.Office.Interop.Excel.Range;
using System.Diagnostics;
using System.Reflection;
using System.IO;
using System.Text.RegularExpressions;
...
string weekDay = DateTime.Now.ToString("ddd*");
Workbook sourceWorkbook4 = xlApp.Workbooks.Open(LrsIdWorkbook, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
Workbook destinationWorkbook = xlApp.Workbooks.Open(masterWB, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
static String WildCardToRegular(String value)
{
    return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
}
string wsName = null;
foreach (Worksheet works in sourceWorkbook4.Worksheets)
{
    Boolean startsWithddd = Regex.IsMatch(works.Name, WildCardToRegular(weekDay + "*"));
    if (startsWithddd == true)
    {
        wsName = works.Name.ToString();
    }
}
Worksheet sourceWorksheet4 = (Worksheet)sourceWorkbook4.Worksheets.get_Item(wsName);
...
Upvotes: 0
Reputation: 460288
You could use the VB.NET Like-Operator:
string text = "x is not the same as X and yz not the same as YZ";
bool contains = LikeOperator.LikeString(text,"*X*YZ*", Microsoft.VisualBasic.CompareMethod.Binary);
Use CompareMethod.Text if you want to ignore case.
You need to add using Microsoft.VisualBasic.CompilerServices; and add a reference to Microsoft.VisualBasic.dll.
Since it's part of the .NET Framework and will always be, it's not a problem to use this class.
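To show the difference between the two compare methods, here is a minimal sketch. The LikeDemo class name and the "X*" pattern are chosen for illustration, and the project is assumed to reference the Microsoft.VisualBasic assembly.

```csharp
using System;
using Microsoft.VisualBasic;
using Microsoft.VisualBasic.CompilerServices;

public static class LikeDemo
{
    public static void Main()
    {
        string text = "x is not the same as X and yz not the same as YZ";

        // Binary comparison is case sensitive: the text starts with a lower-case "x".
        bool binary = LikeOperator.LikeString(text, "X*", CompareMethod.Binary);

        // Text comparison ignores case, so "X*" matches the leading "x".
        bool ignoreCase = LikeOperator.LikeString(text, "X*", CompareMethod.Text);

        Console.WriteLine(binary);     // False
        Console.WriteLine(ignoreCase); // True
    }
}
```

Note that the Like operator gives special meaning to more than * and ?: # matches a single digit and [...] denotes a character list, so user input containing those characters will not be matched literally.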
Upvotes: 38
Reputation: 1
public class Wildcard
{
    private readonly string _pattern;

    public Wildcard(string pattern)
    {
        _pattern = pattern;
    }

    public static bool Match(string value, string pattern)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end);
    }

    public static bool Match(string value, string pattern, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end, toLowerTable);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end, char[] toLowerTable)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end);
    }

    public bool IsMatch(string str, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str, ref int start, ref int end)
    {
        if (_pattern.Length == 0) return false;
        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;
                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (str[si] == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && str[si] != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }

    public bool IsMatch(string str, ref int start, ref int end, char[] toLowerTable)
    {
        if (_pattern.Length == 0) return false;
        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;
                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    char c = toLowerTable[str[si]];
                    if (c == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && c != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    continue;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }
}
Upvotes: -1
Reputation: 7
C# console application sample
Command line sample:
C:/> App_Exe -Opy PythonFile.py 1 2 3
Console output:
Argument list: -Opy PythonFile.py 1 2 3
Found python filename: PythonFile.py
using System;
using System.Text.RegularExpressions; // Regex

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string cmdLine = String.Join(" ", args);
            bool bFileExtFlag = false;
            int argIndex = 0;
            // "*.py" as a regex: anything, followed by a literal ".py" at the end of the argument.
            // The "$" anchor keeps names such as "file.pyc" from matching.
            Regex regex = new Regex(@"^(?s:.*)\.py$", RegexOptions.IgnoreCase);
            foreach (string s in args)
            {
                // Search for the first argument that matches the "*.py" pattern
                bFileExtFlag = regex.IsMatch(s);
                if (bFileExtFlag)
                    break;
                argIndex++;
            }
            Console.WriteLine("Argument list: " + cmdLine);
            if (bFileExtFlag)
                Console.WriteLine("Found python filename: " + args[argIndex]);
            else
                Console.WriteLine("Python file with extension <.py> not found!");
        }
    }
}
Upvotes: -4
Reputation: 2381
Using WildcardPattern from System.Management.Automation may be an option.
var pattern = new WildcardPattern(patternString);
pattern.IsMatch(stringToMatch);
The Visual Studio UI may not allow you to add the System.Management.Automation assembly to the References of your project. Feel free to add it manually, as described here.
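As a slightly fuller illustration, the sketch below passes WildcardOptions.IgnoreCase so the filter does not depend on letter case. The PowerShellWildcardDemo class name and the sample strings are made up, and the project is assumed to already reference System.Management.Automation.

```csharp
using System;
using System.Management.Automation;

public static class PowerShellWildcardDemo
{
    public static void Main()
    {
        // IgnoreCase makes "X", "YZ" and "P" match regardless of letter case.
        var pattern = new WildcardPattern("X*YZ*P", WildcardOptions.IgnoreCase);

        Console.WriteLine(pattern.IsMatch("x then yz then p"));
        Console.WriteLine(pattern.IsMatch("no match here"));
    }
}
```

PowerShell wildcards also understand ? and bracket expressions such as [a-z], so input containing square brackets is not taken literally.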
Upvotes: 21
Reputation: 627468
A wildcard * can be translated as the .* or .*? regex pattern.
You might need to use singleline mode to match newline symbols; in that case, you can use (?s) as part of the regex pattern.
You can set it for the whole pattern or for part of it:
X* => @"X(?s:.*)"
*X => @"(?s:.*)X"
*X* => @"(?s).*X.*"
*X*YZ* => @"(?s).*X.*YZ.*"
X*YZ*P => @"(?s:X.*YZ.*P)"
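The effect of singleline mode shows up as soon as the input contains a line break. Here is a minimal sketch, with a made-up two-line input and the hypothetical class name SinglelineDemo, comparing the default behaviour with (?s) and with the equivalent RegexOptions.Singleline flag.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    public static void Main()
    {
        string input = "X first line\nsecond line YZ";

        // By default "." does not match "\n", so ".*" cannot cross the line break.
        Console.WriteLine(Regex.IsMatch(input, @"^X.*YZ$"));                           // False

        // The inline (?s) switch lets "." match any character, including "\n".
        Console.WriteLine(Regex.IsMatch(input, @"(?s)^X.*YZ$"));                       // True

        // The same switch passed as an option instead of inline.
        Console.WriteLine(Regex.IsMatch(input, @"^X.*YZ$", RegexOptions.Singleline));  // True
    }
}
```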
Upvotes: 7
Reputation: 174844
*X*YZ* = string contains X and contains YZ
@".*X.*YZ"
X*YZ*P = string starts with X, contains YZ and ends with P.
@"^X.*YZ.*P$"
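Since the question also asks about a simple algorithm, here is a regex-free sketch for patterns that only use *. It is not taken from any answer here: the SimpleWildcard class and its IsMatch method are hypothetical names, and the approach (split on *, check the first and last pieces, then find the middle pieces left to right) is just one way to do it under the assumption of ordinal, case-sensitive comparison.

```csharp
using System;

public static class SimpleWildcard
{
    // Matches input against a pattern in which "*" means "any run of characters".
    public static bool IsMatch(string input, string pattern)
    {
        string[] parts = pattern.Split('*');

        // No wildcard at all: the strings must be identical.
        if (parts.Length == 1) return input == pattern;

        string first = parts[0];
        string last = parts[parts.Length - 1];

        // The fixed prefix and suffix must fit without overlapping.
        if (input.Length < first.Length + last.Length) return false;
        if (!input.StartsWith(first, StringComparison.Ordinal)) return false;
        if (!input.EndsWith(last, StringComparison.Ordinal)) return false;

        // Middle pieces must appear in order between the prefix and the suffix.
        int pos = first.Length;
        int limit = input.Length - last.Length;
        for (int k = 1; k < parts.Length - 1; k++)
        {
            int found = input.IndexOf(parts[k], pos, limit - pos, StringComparison.Ordinal);
            if (found < 0) return false;
            pos = found + parts[k].Length;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch("aXbYZc", "*X*YZ*")); // True
        Console.WriteLine(IsMatch("XYZP", "X*YZ*P"));   // True
        Console.WriteLine(IsMatch("XYP", "X*YZ*P"));    // False
        Console.WriteLine(IsMatch("abX", "X*"));        // False
    }
}
```

Taking the leftmost occurrence of each middle piece is sufficient here, because an earlier match never leaves less room for the pieces that follow.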
Upvotes: 5