Tom Enns
Tom Enns

Reputation: 182

Excel VBA Regex - how to match 3 characters when there may be other characters in between

I'm dealing with some data that has some odd symbols (from some other language)

Im writing a macro in excel that uses regular expressions, and I'd like to match "1x2" when the string contains 1â€xâ€2

Upvotes: 1

Views: 1125

Answers (3)

brettdj
brettdj

Reputation: 55682

Something like this will return the desired string components.

I have assumed that rather than 1 or 2 you actually wanted any numeric match. If not change "(\d).*?(x).*?(\d)" to "(1).*?(x).*?(2)"

Dim objRegex As Object
Dim strIn As String
strIn = "1â€xâ€2"
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\d).*?(x).*?(\d)"
If .test(strIn) Then MsgBox .Replace(strIn, "$1$2$3")
End With

Upvotes: 1

Adi Inbar
Adi Inbar

Reputation: 12323

You can use \u#### in a regex to match a character by its Unicode number, where #### is the Unicode number in hex. The hex Unicode for â is 00E2, and the hex Unicode for € is 20AC, so this regex

1\u00E2\u20ACx\u00E2\u20AC2

will match 1â€xâ€2.

Is that what you were looking for? I'm not sure exactly what you mean by "match '1x2' when the string contains 1â€xâ€2". Are you saying that you want to match the sequence 1â€xâ€2 itself (which is what the regex above will do), or match 1x2 in strings that contain 1â€xâ€2 somewhere else (which is what your wording seems to imply, but I have the feeling that's not what you mean), or that you want to replace 1â€xâ€2 with 1x2?


EDIT: Actually, based on the question's title, where you say "there may be other characters in between", you'll want this regex, which makes those "other characters" optional:

1(\u00E2\u20AC)?x(\u00E2\u20AC)?2

This will match any of the following:

  • 1â€xâ€2
  • 1x2
  • 1â€x2
  • 1xâ€2

Upvotes: 1

Explosion Pills
Explosion Pills

Reputation: 191749

You could use 1.x.2. If there is no guarantee that there will be characters between them, 1.?x.?2 should work.

Upvotes: -1

Related Questions