Nicoladastra
Nicoladastra

Reputation: 13

Split the content of cells in a column using regex

Here is the code that I had until now, which actually deletes everything in a cell but the numbers. This looked like this:

Sub myTest()

   Dim myCel As Range
   Dim i As Integer
    i = 0
'copier le contenu de la colonne A dans la colonne B

   Columns("A:A").Select
    Selection.Copy
    Columns("B:B").Select
    ActiveSheet.Paste

'on supprime tout ce qui n'est pas un nombre

   With CreateObject("VBScript.Regexp")
         .Global = True
         .Pattern = "\D+"
  For Each myCel In Range("B1:B900")
           myCel.Value = .Replace(myCel.Value, "")
  Next
  End With
End Sub

It worked alright... Now, what I want to do is actually to split the content of the cells into 3 cells: one with everything that's before the numbers, one with the numbers and one with what is after the number. I have got this kind of column:

<g0 t="bold">
</g0>
<g1>
</g1>
<g2>
</g2>
<g3>
</g3>
<g4>
</g4>
<i5 t="lb"/>
<i6 t="lb"/>
<g7>
</g7>
<g8>
</g8>
<i9 t="lb"/>
<i10 t="lb"/>
<i11 t="lb"/>

And I would like to get something like this: (in NP++ the regex would be (<.?)([0-9]{1,3})(.>) and then replace with \1\t\2\t\3. Well, in VBA it's more complex and I wanted to ask for help on this one... hope someone will have an idea :)

<g  0    t="bold">
</g 0   >
<g  1   >
</g 1   >
<g  2   >
</g 2   >
<g  3   >
</g 3   >
<g  4   >
</g 4   >
<i  5    t="lb"/>
<i  6    t="lb"/>
<g  7   >
</g 7   >
<g  8   >
</g 8   >
<i  9    t="lb"/>
<i  10   t="lb"/>
<i  11   t="lb"/>

Now my question is not so much about the regex itself (I am quite ok with regex) than the VBA code that allows me to really split the cell and send the 3 parts to 3 different cells in a loop...

Upvotes: 1

Views: 7523

Answers (2)

Gary&#39;s Student
Gary&#39;s Student

Reputation: 96753

Without Regex, select your cells and:

Sub parser101()
Dim r As Range, v As String, L As Long
Dim I As Long, N1 As Long, N2 As Long
N1 = 0
N2 = 0
For Each r In Selection
    v = r.Text
    L = Len(v)
    For I = 1 To L
        If IsNumeric(Mid(v, I, 1)) Then
            N1 = I
            GoTo escape101
        End If
    Next I
    '
escape101:
    '
    For I = N1 To L
        If Not IsNumeric(Mid(v, I, 1)) Then
            N2 = I
            GoTo escape102
        End If
    Next I
    '
escape102:
    '
    r.Offset(0, 1) = Mid(v, 1, N1 - 1)
    r.Offset(0, 2) = Mid(v, N1, N2 - N1)
    r.Offset(0, 3) = Mid(v, N2)
Next r
End Sub

Upvotes: 0

Automate This
Automate This

Reputation: 31364

You are pretty close with the Regex, I had to make a small modification.

First, make sure you add the reference to "Microsoft VBScript Regular Expressions 5.5" as to your VBA module (see this link for how to do that).

Private Sub TestRegex()
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strRaplace As String
    Dim strOutput As String
    Dim Myrange As Range

    Set Myrange = ActiveSheet.Range("B1:B19")

    For Each C In Myrange
        strPattern = "(\<.?.?)([0-9]{1,3})(.*>)"

        If strPattern <> "" Then
            strInput = C.Value
            strReplace = "$1"

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            If regEx.Test(strInput) Then
                C.Offset(0, 1) = regEx.Replace(strInput, "$1")
                C.Offset(0, 2) = regEx.Replace(strInput, "$2")
                C.Offset(0, 3) = regEx.Replace(strInput, "$3")
            Else
                C.Offset(0, 1) = "(Not matched)"
            End If
        End If
    Next
End Sub

My input is in column B and the output is shown in C, D & E

enter image description here

Upvotes: 3

Related Questions