Reputation: 182
I'm dealing with some data that has some odd symbols (from some other language)
Im writing a macro in excel that uses regular expressions, and I'd like to match "1x2" when the string contains 1â€xâ€2
Upvotes: 1
Views: 1125
Reputation: 55682
Something like this will return the desired string components.
I have assumed that rather than 1 or 2 you actually wanted any numeric match. If not change "(\d).*?(x).*?(\d)"
to "(1).*?(x).*?(2)"
Dim objRegex As Object
Dim strIn As String
strIn = "1â€xâ€2"
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\d).*?(x).*?(\d)"
If .test(strIn) Then MsgBox .Replace(strIn, "$1$2$3")
End With
Upvotes: 1
Reputation: 12323
You can use \u#### in a regex to match a character by its Unicode number, where #### is the Unicode number in hex. The hex Unicode for â is 00E2, and the hex Unicode for € is 20AC, so this regex
1\u00E2\u20ACx\u00E2\u20AC2
will match 1â€xâ€2.
Is that what you were looking for? I'm not sure exactly what you mean by "match '1x2' when the string contains 1â€xâ€2". Are you saying that you want to match the sequence 1â€xâ€2 itself (which is what the regex above will do), or match 1x2 in strings that contain 1â€xâ€2 somewhere else (which is what your wording seems to imply, but I have the feeling that's not what you mean), or that you want to replace 1â€xâ€2 with 1x2?
EDIT: Actually, based on the question's title, where you say "there may be other characters in between", you'll want this regex, which makes those "other characters" optional:
1(\u00E2\u20AC)?x(\u00E2\u20AC)?2
This will match any of the following:
Upvotes: 1
Reputation: 191749
You could use 1.x.2
. If there is no guarantee that there will be characters between them, 1.?x.?2
should work.
Upvotes: -1