Alex J. S. Martin
Alex J. S. Martin

Reputation: 11

Split delimiter has to be set to "," but value returned may include a comma

I am attempting to run a program which should pull in data from a spreadsheet, split the data in to chunks, and then import it in to my table based on it's "Values" variant. The format of the data pulled in would look something like this:

"HL","Hecla Mining Company Mining Stock","NSM",12.52,8.69,14.07,6.18

The code for splitting lines, defining values, and assigning them to columns is currently written as follows:

Dim Resp As String: Resp = Http.ResponseText
    Dim Lines As Variant: Lines = Split(Resp, vbLf)
    Dim sLine As String
    Dim Values As Variant
    For i = 0 To UBound(Lines)
        sLine = Lines(i)
        If InStr(sLine, ",") > 0 Then
            Values = Split(sLine, ",")
            W.Cells(i + 2, 2).Value = Replace(Values(1), Chr(34), "")
            W.Cells(i + 2, 5).Value = Replace(Values(2), Chr(34), "")
            W.Cells(i + 2, 6).Value = Values(3)
            W.Cells(i + 2, 7).Value = Values(4)
            W.Cells(i + 2, 8).Value = Values(5)
            W.Cells(i + 2, 9).Value = Values(6)
            W.Cells(i + 2, 10).Value = Values(7)
            W.Cells(i + 2, 11).Value = Values(8)
            W.Cells(i + 2, 13).Value = Values(9)
        End If

The issue occurs in that some of the lines return a name that includes a comma within it such as this:

"CDE","Coeur Mining, Inc.","NSM",7.59,16.25,9.52,7.01

This is causing Values(2) = "Coeur Mining" and Value(3) = "Inc." as opposed to Values(2) = "Coeur Mining, Inc." and Values(3) = "NSM"

I have tried updating the code to the following:

Dim Resp As String: Resp = Http.ResponseText
    Dim Lines As Variant: Lines = Split(Resp, vbLf)
    Dim sLine As String
    Dim Values As Variant
    For i = 0 To UBound(Lines)
        sLine = Lines(i)
        If InStr(sLine, ",") > 0 Then
           ***If InStr(sLine, ",Inc.") Then
            sLine = Replace(sLine, ",inc.", "")
        End If***
            Values = Split(sLine, ",")
            W.Cells(i + 2, 2).Value = Replace(Values(1), Chr(34), "")
            W.Cells(i + 2, 5).Value = Replace(Values(2), Chr(34), "")
            W.Cells(i + 2, 6).Value = Values(3)
            W.Cells(i + 2, 7).Value = Values(4)
            W.Cells(i + 2, 8).Value = Values(5)
            W.Cells(i + 2, 9).Value = Values(6)
            W.Cells(i + 2, 10).Value = Values(7)
            W.Cells(i + 2, 11).Value = Values(8)
            W.Cells(i + 2, 13).Value = Values(9)
        End If

However it does not appear to be functioning even with the nested If statement looking for ",Inc." within the sLine string.

Is there a formatting issue that I am not getting? I attempted to use a regex function as well but I am very new to excel/VBA and was unable to get it to figure out how to format it properly.

The regex code that was suggested was given as follows:

Public Function splitLine(line As String) As String()

Dim regex As Object     
Set regex = CreateObject("vbscript.regexp")
regex.IgnoreCase = True
regex.Global = True
regex.Pattern = ",(?=([^" & Chr(34) & "]" & Chr(34) & "[^" & Chr(34) & "]" & Chr(34) & ")(?![^" & Chr(34) & "]" & Chr(34) & "))"
splitLine = Split(regex.Replace(line, ";"), ";") End Function
Values = splitLine(sLine)

Any help would be greatly appreciated, more information or a copy of the actual excel file could be provided upon request.

Upvotes: 1

Views: 327

Answers (2)

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60224

Here is a regex-based SplitLine function that will return an array of strings. It will exclude the surrounding quotes from those entries that have it, and it will not split on the "included" commas:

Option Explicit

Public Function splitLine(line As String) As String()
  Dim regex As Object, matchcol As Object, match As Object
  Dim I As Long, S() As String

Set regex = CreateObject("vbscript.regexp")
With regex
    .Global = True
    .Pattern = """([^""\r\n]*)""|([^,\r\n]+)"
    If .test(line) = True Then
        Set matchcol = .Execute(line)
        ReDim S(0 To matchcol.Count - 1)
        I = 0

        'matches surrounded by quotes will be in 0
        'matches without quotes will be in 1
        For Each match In matchcol
            With match
                S(I) = .submatches(0) & .submatches(1)
            End With
            I = I + 1
        Next match
    End If
End With
splitLine = S
End Function

Upvotes: 0

user4039065
user4039065

Reputation:

It looks like you are going to have to process the string through a 'helper' function that mimics the 'quoted text' parameter of Text-to-Columns.

While inelegant (and likely easily improved upon), this works for your sample.

Option Explicit

Sub test()
    Dim str As String, var As Variant

    str = """CDE"",""Coeur Mining, Inc."",""NSM"",7.59,16.25,9.52,7.01"
    With Worksheets("Sheet1")
        Debug.Print str
        str = cleanQuotedCommas(str)
        var = Split(str, Chr(44))
        With .Cells(2, "B").Resize(1, UBound(var) + 1)
            .Value = var
            .Replace what:=ChrW(8203), replacement:=Chr(44), lookat:=xlPart
            .Replace what:=Chr(34), replacement:=vbNullString, lookat:=xlPart
            .Value = .Value2
        End With
    End With
End Sub

Function cleanQuotedCommas(str As String) As String
    Dim i As Long, j As Long, k As Long
    i = InStr(1, str, Chr(34), vbBinaryCompare)
    Do While CBool(i)
        j = InStr(i + 1, str, Chr(34), vbBinaryCompare)
        k = InStr(i + 1, str, Chr(44), vbBinaryCompare)
        If k > i And k < j Then
            str = Replace(str, Chr(44), ChrW(8203), i, 1, vbBinaryCompare)
        End If
        Debug.Print str
        i = InStr(j + 1, str, Chr(34), vbBinaryCompare)
    Loop
    cleanQuotedCommas = str
End Function

enter image description here

Note the right-alignment of the true numbers for the doubles and the left-alignment of the text.

Upvotes: 0

Related Questions