Reputation: 7419
How can I check that a string matches a certain format? For example, how can I check that a string matches the format of an IP address, proxy address (or any custom format)?
I found this code but I am unable to understand what it does. Please help me understand the match string creation process.
string pattern = @"^([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}$";
//create our Regular Expression object
Upvotes: 1
Views: 11726
Reputation: 12225
Regex matching is made simple:
Regex r = new Regex(@"your_regexp");
if (r.Match(whatever).Success)
{
// Do_something
}
This code runs the body of the `if` statement when the string `whatever` matches the regular expression `your_regexp`.
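For reference, here is a minimal self-contained sketch of the same check. The class name `MatchDemo`, the method names and the sample pattern are made up for illustration; `Regex`, `Match` and `IsMatch` come from `System.Text.RegularExpressions`.

```csharp
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Hypothetical sample pattern: one or more digits and nothing else.
    private static readonly Regex Digits = new Regex(@"^\d+$");

    // Same check as in the snippet above: Match(...).Success.
    public static bool LooksLikeNumber(string input)
    {
        return Digits.Match(input).Success;
    }

    // Shortcut for the case where only a yes/no answer is needed.
    public static bool LooksLikeNumberShort(string input)
    {
        return Regex.IsMatch(input, @"^\d+$");
    }
}
```

`MatchDemo.LooksLikeNumber("12345")` returns true and `MatchDemo.LooksLikeNumber("12a45")` returns false. Keeping the `Regex` object in a field avoids rebuilding it on every call.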
So what are regular expressions (also abbreviated as regex or regexp)? They are string patterns designed to be used as filters for other strings.
Let's assume you have a lot of HTTP headers and you want to pick out only `GET moofoo HTTP/1.1`. You could use the `string.Contains(other_string)` method, but regexps make the check more precise and flexible.
A regexp consists of blocks, which can also be reused later for replacement. Each block defines which symbols the string may contain at a given position. Inside a block you either list these symbols explicitly or use patterns to make the work easier.
The symbols that may or may not appear at the current position in the string are described as follows:
- if the symbols are always present, write them as-is: in our case this is the `HTTP` word, which is always present in HTTP headers.
- if you know all possible variants, use the `|` (logical `OR`) operator. Note: all variants must be enclosed by block signs, that is, round brackets (read below for details). In our case this one matches the `GET` word: this header could use the `GET`, `POST`, `PUT` or `DELETE` words.
- if you know all possible symbol ranges, use range blocks: for example, lowercase letters could be described as `[a-z]`, while `[\w]` also covers uppercase letters, digits and the underscore. (The POSIX form `[[:alpha:]]` exists in some other regex flavours but is not supported by .NET.) Square brackets are the signs of range blocks. A range block on its own matches exactly one symbol; combine it with a count operator to define repetitions:
  - `?` means 'may be present or absent'
  - `+` stands for 'once or more'
  - `*` stands for 'zero or more'
  - `{A,}` stands for 'A or more'
  - `{A,B}` means 'not less than A and not more than B times'
  - `{0,B}` stands for 'not more than B' (.NET does not treat `{,B}` as a count operator)
- if you know which symbol ranges must not be present, use the `NOT` operator (`^`) at the very beginning of a range: `[^a-z]` matches every symbol of `132==?`, while `[^\d]` matches every symbol of `abc==?` (`\d` stands for any digit and is roughly equivalent to `[0-9]`). Note: outside a range block, `^` marks the very beginning of the entire string: `^moo` matches `moofoo` and not `foomoo`. To finish the idea, `$` marks the very end of the entire string: `moo$` matches `foomoo` and not `moofoo`.
- if you don't care which symbol to match, use the dot: `.` matches any symbol (except a line break by default), and `.*` is the most commonly used pattern to match any number of any symbols.
Note: all blocks should be enclosed by round brackets (`(phrase)` is a good block example).
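The anchors, ranges and count operators described above can be tried out one at a time. The following is only a sketch: the class `PatternPieces` and its method names are invented for this example, and the sample strings are the ones used in the text.

```csharp
using System.Text.RegularExpressions;

public static class PatternPieces
{
    // ^ anchors the pattern to the very beginning of the string.
    public static bool StartsWithMoo(string s) => Regex.IsMatch(s, @"^moo");

    // $ anchors the pattern to the very end of the string.
    public static bool EndsWithMoo(string s) => Regex.IsMatch(s, @"moo$");

    // [^a-z] matches one symbol that is NOT a lowercase ASCII letter,
    // so this is true when the string contains at least one such symbol.
    public static bool HasNonLowercase(string s) => Regex.IsMatch(s, @"[^a-z]");

    // {2,4} means "not less than 2 and not more than 4 times".
    public static bool TwoToFourDigits(string s) => Regex.IsMatch(s, @"^\d{2,4}$");

    // ? makes the preceding symbol optional.
    public static bool ColorOrColour(string s) => Regex.IsMatch(s, @"^colou?r$");
}
```

With these helpers, `StartsWithMoo("moofoo")` is true and `StartsWithMoo("foomoo")` is false, `EndsWithMoo("foomoo")` is true, `HasNonLowercase("132==?")` is true while `HasNonLowercase("abc")` is false, and `TwoToFourDigits` accepts `123` but rejects `1` and `12345`.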
Note: all reserved symbols (such as the dot `.` or the round brackets `(` and `)`) should be escaped, that is, written with a back-slash in front (`\(`, `\)`, `\.`), if they do not belong to any block and should be matched as-is. Non-printable symbols are written as escape sequences too, e.g. `\t` for the tab symbol. For example, in our case there are two escape sequences within the `HTTP/1.1` block: `\/` and `\.`. These two are matched as-is. (Strictly speaking, .NET does not require `/` to be escaped, but the back-slash does no harm.)
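When the literal text is not known in advance, escaping by hand is error-prone. .NET provides `Regex.Escape` for this; the sketch below shows one way to use it. The wrapper class `EscapeDemo` and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Returns the literal with every reserved symbol back-slash escaped.
    public static string EscapeLiteral(string literal)
    {
        return Regex.Escape(literal);
    }

    // True only if the input contains the literal text exactly as given.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }
}
```

`EscapeDemo.EscapeLiteral("HTTP/1.1")` returns `HTTP/1\.1`: the dot is escaped and the slash is left alone. Without escaping, the pattern `HTTP/1.1` would also match `HTTP/1x1`, because an unescaped dot stands for any symbol.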
Putting together everything above (which took me nearly 30 minutes to type), let's create a regexp to match our example HTTP header:
- `(GET|POST|PUT|DELETE)` will match the HTTP method
- `\ ` will match the `<SP>` symbol (space, as it is defined in the HTTP specification)
- `HTTP\/` will help us to match HTTP requests only
- `(\d+\.\d+)` will match the HTTP version (this will match not only `1.1`, but `12.34` too)
- `^` and `$` will be our string border-limiters
Gathering all these statements together gives us this regexp: `^(GET|POST|PUT|DELETE)\ HTTP\/(\d+\.\d+)$`. Note that it contains no block for the requested resource (`moofoo` in our example), so a block such as `(\S+)\ ` has to be added after the method for the full line to match.
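A quick way to check the assembled pattern is to run it against a few request lines. The sketch below does that and also tries a variant with an extra `(\S+)` group for the request target, since the pattern `^(GET|POST|PUT|DELETE)\ HTTP\/(\d+\.\d+)$` has no part that could match `moofoo`. The variant, and the names `RequestLine`, `Original`, `WithTarget` and `VersionOf`, are assumptions made for illustration.

```csharp
using System.Text.RegularExpressions;

public static class RequestLine
{
    // Pattern as assembled in the text: method, one space, HTTP version.
    public static readonly Regex Original =
        new Regex(@"^(GET|POST|PUT|DELETE)\ HTTP\/(\d+\.\d+)$");

    // Assumed variant: (\S+) added for the request target, e.g. "moofoo".
    public static readonly Regex WithTarget =
        new Regex(@"^(GET|POST|PUT|DELETE)\ (\S+)\ HTTP\/(\d+\.\d+)$");

    // Returns the HTTP version captured by the third group of WithTarget,
    // or null when the line does not match.
    public static string VersionOf(string line)
    {
        Match m = WithTarget.Match(line);
        return m.Success ? m.Groups[3].Value : null;
    }
}
```

Here `Original` matches `GET HTTP/1.1` but not `GET moofoo HTTP/1.1`, while `WithTarget` matches the latter. `VersionOf("POST /items HTTP/12.34")` returns `12.34`, which shows how a round-bracket block can be read back after a match.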
Upvotes: 16
Reputation: 3147
Regular expressions are what you use to perform a lookup on a string. A pattern is defined, and you use this pattern to work out the matches for your expression. This is best seen by example.
Here is some sample code I wrote last year for checking whether an entered string is a valid frequency in Hz, KHz, MHz, GHz or THz.
Understanding regular expressions will come from trial and error. Read the regular expressions documentation here: http://msdn.microsoft.com/en-us/library/2k3te2cs(v=vs.80).aspx
The expression below took me about 6 hours to get working, because I misunderstood what certain terms meant and where I needed brackets. But once I had this one cracked, the other 6 were very simple.
/// <summary>
/// Checks the given string against a regular expression to see
/// if it is a valid hertz measurement, which can be used
/// by this formatter.
/// </summary>
/// <param name="value">The string value to be tested</param>
/// <returns>Returns true, if it is a valid hertz value</returns>
private Boolean IsValidValue(String value)
{
    // Regular expression explanation
    //
    // Start (^)
    // Negative numbers allowed (-?)
    // At least 1 digit (\d+)
    // Optional (. followed by at least 1 digit) ((\.\d+)?)
    // Optional (optional whitespace + unit text)
    //   (\s?(([h].*)?([k].*)?([m].*)?([g].*)?([t].*)?)+)?
    //   where the unit text starts with one of the letters h, k, m, g, t
    //   and may be followed by any characters up to the end of the string.
    // End ($)
    String expression = @"^-?\d+(\.\d+)?(\s?(([h].*)?([k].*)?([m].*)?([g].*)?([t].*)?)+)?$";
    return Regex.IsMatch(value, expression, RegexOptions.IgnoreCase);
}
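To see what this expression accepts, it helps to run it on a few inputs. The sketch below keeps the expression unchanged and, purely as an assumption about what a tighter check could look like, adds a stricter variant that allows only an optional k/m/g/t prefix followed by `hz`. The class `HertzCheck` and the name `Strict` are not part of the original code.

```csharp
using System.Text.RegularExpressions;

public static class HertzCheck
{
    // Expression from the method above, unchanged.
    public const string Original =
        @"^-?\d+(\.\d+)?(\s?(([h].*)?([k].*)?([m].*)?([g].*)?([t].*)?)+)?$";

    // Assumed stricter variant: optional k/m/g/t prefix, then "hz".
    public const string Strict = @"^-?\d+(\.\d+)?(\s?[kmgt]?hz)?$";

    // Case-insensitive match, as in IsValidValue.
    public static bool Matches(string value, string expression)
    {
        return Regex.IsMatch(value, expression, RegexOptions.IgnoreCase);
    }
}
```

Both expressions accept `5 kHz` and `-1.5MHz` and reject `abc`. They differ on input such as `5 hello`: the original accepts it, because anything may follow the first unit letter, while the stricter variant rejects it.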
Upvotes: 2
Reputation: 54359
It looks like you are specifically looking for regular expressions which support IP addresses with port numbers. This thread may be useful; IPs with port numbers are discussed in detail, and there are some examples given:
http://www.codeproject.com/Messages/2829242/Re-Using-Regex-in-Csharp-for-ip-port-format.aspx
Keep in mind that a structurally valid IP is different from a completely valid IP that only has valid numbers in it. For example, 999.999.999.999:0000 has a valid structure, but it is not a valid IP address.
Alternatively, IPAddress.TryParse() may work for you, but I have not tried it myself.
http://msdn.microsoft.com/en-us/library/system.net.ipaddress.tryparse.aspx
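As a rough sketch of the `IPAddress.TryParse()` route for an `address:port` string: the helper below splits off the port and lets the framework validate the address. The class `EndpointCheck`, the method `IsIPv4WithPort` and the accepted port range of 1 to 65535 are assumptions for this example, not something taken from the linked pages.

```csharp
using System.Globalization;
using System.Net;
using System.Net.Sockets;

public static class EndpointCheck
{
    // Returns true for text shaped like "a.b.c.d:port" with a parseable
    // IPv4 address and a port between 1 and 65535 (assumed range).
    public static bool IsIPv4WithPort(string text)
    {
        int colon = text.LastIndexOf(':');
        if (colon <= 0 || colon == text.Length - 1) return false;

        string host = text.Substring(0, colon);
        string portText = text.Substring(colon + 1);

        // TryParse also accepts shorthand such as "1.2", so require four parts.
        if (host.Split('.').Length != 4) return false;

        if (!IPAddress.TryParse(host, out var address)) return false;
        if (address.AddressFamily != AddressFamily.InterNetwork) return false;

        // NumberStyles.None allows digits only: no sign, no whitespace.
        return int.TryParse(portText, NumberStyles.None,
                            CultureInfo.InvariantCulture, out int port)
               && port >= 1 && port <= 65535;
    }
}
```

`IsIPv4WithPort("192.168.1.1:8080")` returns true, while `999.999.999.999:0000` is rejected because the address part does not parse. Note that `TryParse` is somewhat more lenient than a strict dotted-decimal regex, so the two approaches may disagree on unusual input.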
Upvotes: 1