Reputation: 23
I have a program that needs to parse town names. Sometimes users enter the correct town name, but often they enter the post code instead.
If I cannot match the input against a valid town name, I assume that it contains the post code. The first 3 free standing characters of the post code uniquely identify the town.
Post codes have this format: 3 letters followed by 3 digits, e.g. ABC123.
However, some users enter the digits before the letters, and some combine the town name and the post code, e.g.
123ABC
Pretty city ABC123
How do I extract the first 3 free standing characters?
Free standing = the 3 letters are not directly preceded or followed by another letter (digits, spaces, or the start or end of the string are fine).
For the below strings ABC are the first 3 free standing characters.
ABC123
123ABC
ABC 123
123 ABC
123 ABC 456
ABC12DEF
123 ABC DEF
DE 123 ABC
Pretty city ABC123
These next strings do not have 3 free standing characters.
123ABCDEF
ABCD123
123ABCD
123 ABCD
Somename1234
1234Somename
Case is irrelevant.
Here are my attempts:
Using a regex. This does not work for "Pretty city ABC123" (it matches "Pre").
Regex rgx = new Regex("[a-zA-Z]{3}");
string hamster = "ABC123";
var code = rgx.Match(hamster);
Awkward function
private static string GetCode(string pig)
{
    var code = "";
    var canstart = true;
    for (int i = 0; i < pig.Length; i++)
    {
        //Console.WriteLine(code);
        if (code.Length == 3)
        {
            if (char.IsLetter(pig[i]))
            {
                canstart = false;
                code = "";
            }
            else
            {
                break;
            }
        }
        if (char.IsLetter(pig[i]) && canstart)
        {
            code += pig[i];
        }
        else if (!char.IsLetter(pig[i]) && !canstart)
        {
            canstart = true;
        }
    }
    if (code.Length != 3)
    {
        code = "";
    }
    return code;
}
Upvotes: 2
Views: 306
Reputation: 627607
You can use
(?<![a-zA-Z])[a-zA-Z]{3}(?![a-zA-Z])
See the regex demo. Details:
(?<![a-zA-Z]) - a negative lookbehind that matches a location that is not immediately preceded by an ASCII letter
[a-zA-Z]{3} - three ASCII letters
(?![a-zA-Z]) - a negative lookahead that matches a location that is not immediately followed by an ASCII letter.
In C#:
var rgx = new Regex(@"(?<![a-zA-Z])[a-zA-Z]{3}(?![a-zA-Z])");
var hamster = "ABC123";
var code = rgx.Match(hamster).Value; // "ABC"; Match never returns null, and Value is "" when nothing matches
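To check the pattern against the strings listed in the question, it can be wrapped in a small helper. This is only a sketch: the `TownCode` class and the `Extract` method are hypothetical names, and upper-casing the result is an assumption based on the remark that case is irrelevant.

```csharp
using System;
using System.Text.RegularExpressions;

static class TownCode
{
    // Three ASCII letters with no ASCII letter directly before or after them.
    private static readonly Regex Rgx =
        new Regex(@"(?<![a-zA-Z])[a-zA-Z]{3}(?![a-zA-Z])");

    // Hypothetical helper: returns the first free-standing three-letter code,
    // upper-cased (an assumption, since case is irrelevant), or an empty
    // string when the input contains no such code.
    public static string Extract(string input)
    {
        Match m = Rgx.Match(input);
        return m.Success ? m.Value.ToUpperInvariant() : "";
    }
}
```

With this helper, `TownCode.Extract("Pretty city ABC123")` and `TownCode.Extract("DE 123 ABC")` both return `"ABC"`, while `TownCode.Extract("123ABCDEF")` and `TownCode.Extract("Somename1234")` return an empty string, because every three-letter window in those inputs touches another letter.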
Upvotes: 2