Reputation: 25
I wrote a small program in C# to capture in-game text. My issue is that the text also contains color codes, which I want to remove. I read about the function Regex.Replace, which I think is suited for that.
I have the following string (line) that I want to clean. I used the small tool Expresso to play around with regular expressions, but I never really figured it out.
This is the String i am going to work with:
|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R
I tried ^|( [a-zA-Z0-9]{9})
which gave me these matches:
c001177ff
cff00AA00
cff00AA00
cff00AA00
cffff69b4
cff00AA00
cff40e0d0
cffffff00
cffffff00
cff40e0d0
cffff69b4
cff00AA00
Well, I am not good at regex; in fact, I have only just started with it. I'm not asking anybody to present me with a complete solution (though you are more than welcome to); even a little help on how I can solve this issue would be appreciated. I want to filter the text.
Input code:
|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R
Should be filtered to this:
Save Code = AGQg R9$# 4fR
I think these are hexadecimal color codes: the |c marks the beginning and the |r the end of a colored string. I think the |r | is just used to indicate that the first colored string ends, then we get a space, and the | indicates the start of the next one.
Upvotes: 0
Views: 207
Reputation: 1680
This regex should match all of the characters you want to remove:
([|]c([0-9]|[a-f]|[A-F]){8})|[|]r
Here's the breakdown...
The vertical pipe is the OR operator, so to match a literal pipe, place it in square brackets: [|].
Parentheses make a group. So you're searching for ([|]c([0-9]|[a-f]|[A-F]){8}) OR [|]r, which is all of your color codes OR |r.
The color-code part is the group that begins with |c and is followed by exactly 8 characters, each of which can be 0 through 9, a through f, or A through F.
I tested it at RegexPal.com.
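As a rough sketch of how that pattern could be plugged into Regex.Replace in C#: the class name ColorCodeStripper and the method name Strip are made up for illustration; only Regex and its Replace method come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

static class ColorCodeStripper
{
    // Pattern from this answer: "|c" plus exactly 8 hex digits, or "|r".
    private static readonly Regex ColorCode =
        new Regex(@"([|]c([0-9]|[a-f]|[A-F]){8})|[|]r");

    // Strip is a hypothetical helper name, not a library method.
    public static string Strip(string text) => ColorCode.Replace(text, "");
}

static class Program
{
    static void Main()
    {
        // Sample line taken from the question.
        string input = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";

        Console.WriteLine(ColorCodeStripper.Strip(input)); // Save Code = AGQg R9$# 4fR
    }
}
```

Keeping the Regex in a static readonly field means the pattern is parsed once rather than on every captured line.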
Upvotes: 0
Reputation: 611
In addition to this answer re: escaping the "pipe" character, you're starting your regex with the caret (^) character. This matches the beginning of a line.
A correct regex would be:
\|c[0-9a-zA-Z]{8}
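To see what the caret does in practice, here is a small sketch that counts matches with and without the anchor on the line from the question. AnchorDemo and CountMatches are illustrative names only. Note that this pattern covers the |c codes but not |r, which would still need its own alternative.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Sample line taken from the question.
    public const string Input = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";

    // Hypothetical helper: number of matches of the pattern in Input.
    public static int CountMatches(string pattern) =>
        Regex.Matches(Input, pattern).Count;

    static void Main()
    {
        // Anchored: only the code at the very start of the string can match.
        Console.WriteLine(CountMatches(@"^\|c[0-9a-zA-Z]{8}")); // 1

        // Unanchored: every color code in the line matches.
        Console.WriteLine(CountMatches(@"\|c[0-9a-zA-Z]{8}"));  // 12
    }
}
```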
Upvotes: 0
Reputation: 20163
You're on the right track. Your regex
^|( [a-zA-Z0-9]{9})
has two problems: the ^ start-of-line anchor forces the match to be only at the start of your input string, and the | needs to be escaped because, unescaped, it's the special "or" operator, which completely changes the meaning of your regex.
In addition, the space after the | is undesired, and the capture group is unnecessary, as you only want to eliminate this portion.
If you replace all instances of this
\|[a-zA-Z0-9]{9}
with nothing (the empty string), you will achieve most of your goal. Try it here: http://regex101.com/r/rF6yB6/1
But it seems you really want to eliminate not just exactly nine characters after the pipe, but up to nine characters, so that the shorter |r is removed as well. So use the {1,9} range quantifier instead:
\|[a-zA-Z0-9]{1,9}
Try it: http://regex101.com/r/rF6yB6/2
This seems to achieve your goal exactly.
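One caveat worth checking against your real data: the {1,9} version relies on |r always being followed by a non-alphanumeric character, as it is in your sample (a space). The sketch below uses a made-up input, "|cff00AA00Hi|rDone", to compare it with a pattern that names |c and |r explicitly. QuantifierDemo, Loose and Explicit are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuantifierDemo
{
    // "|" followed by 1 to 9 letters or digits, as suggested in this answer.
    public static string Loose(string text) =>
        Regex.Replace(text, @"\|[a-zA-Z0-9]{1,9}", "");

    // Explicit alternative: "|c" plus 8 hex digits, or "|r".
    public static string Explicit(string text) =>
        Regex.Replace(text, @"\|c[0-9a-fA-F]{8}|\|r", "");

    static void Main()
    {
        // Hypothetical input where text follows "|r" with no space in between.
        string input = "|cff00AA00Hi|rDone";

        Console.WriteLine(Loose(input));    // Hi      ("|rDone" is swallowed)
        Console.WriteLine(Explicit(input)); // HiDone
    }
}
```

If your game never emits text directly after |r, both patterns give the same result on your sample line.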
Please consider bookmarking the Stack Overflow Regular Expressions FAQ for future reference.
Upvotes: 1
Reputation: 99957
var str1 = "|c001177ffSave Code =|r |cff00AA00A|cff00AA00G|cff00AA00Q|cffff69b4g|r |cff00AA00R|cff40e0d09|cffffff00$|cffffff00#|r |cff40e0d04|cffff69b4f|cff00AA00R";
var str2 = Regex.Replace(str1, @"\|(r|[a-zA-Z0-9]{9})", ""); // "Save Code = AGQg R9$# 4fR"
Upvotes: 0
Reputation: 15364
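How about some simple LINQ? Note that this keeps only the code characters ("AGQg R9$# 4fR") and drops the "Save Code =" label.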
var output = String.Join("", input.Split('|')
.Select(s => s.Length != 10 ? ' ' : s.Last()))
.Trim();
Upvotes: 2
Reputation: 28107
So I think the problem you were having was not escaping your | ... the following regex works for me:
var replaced = Regex.Replace(input, @"\|c[0-9a-zA-Z]{8}|\|r", "");
\|c[0-9a-zA-Z]{8} - match "|c" followed by any 8 letters or digits
| - or
\|r - match "|r"
Upvotes: 1
Reputation: 9270
string input = "[The example input from your question]";
string output = input.Replace("|r", "");
while (output.Contains("|c"))
output = output.Remove(output.IndexOf("|c"), 10);
// output = "Save Code = AGQg R9$# 4fR"
I like this much more than using regexes, just because it's so much clearer to me.
Upvotes: 0