XGeek
XGeek

Reputation: 21

How to convert a comma delimited string to a two dimensional array

I have a csv file that is comma delimited with a Lf terminator and I want to convert this to a two dimensional array for further processing. I successfully read the file into a string and modified the line terminator from Lf to Cr as I understand vba does not recognize Lf as a terminator. I want to take this string and create a two dimensional array where I can also determine the number of rows and columns as the original csv file can be of arbitrary size. The csv file can also be large (>300 MB and >1M elements)

I haven't been able to figure out how to determine the number of columns even though there is a CR terminator in the string for each data line. The code below shows how I created the string which is comma delimited with Cr as the terminator

Open fName For Binary As #1

     Buf$ = String$(LOF(1), 0)
     Get 1, , Buf$
     Buf$ = Replace$(Buf$, vbLf, vbCr) ' Replace LF with CR

Close #1

I would like a method or function that will create a two dimensional array and calculate the number of rows and columns

Upvotes: 2

Views: 1273

Answers (2)

Robert Todar
Robert Todar

Reputation: 2145

I would attempt this by:

  1. First splitting the data into rows using vbNewLine.
  2. Then loop your split array, and split each iteration using your comma delimiter. Leaving you with a jagged array (an array containing arrays).
  3. Lastly, you now have the dimensions for the two dim array, ReDim and loop over it to add all your data.

For abstraction purposes, I separated this task into three separate functions.


Main function to Call

I made the delimiter optional so it's not restrictive to only commas.

This does not account for escaped characters, you'll have to add that if that is needed.

Private Function TextFileToArray(ByVal FilePath As String, Optional ByVal Delimiter As String = ",") As Variant

    'READ THE CONTENTS FROM TEXTFILE
    Dim FileContent As String
    FileContent = ReadTextFile(FilePath)

    'SEPERATE THE ROWS USING vbNewLine
    Dim SplitData As Variant
    SplitData = Split(FileContent, vbNewLine)

    'CREATE A JAGGED ARRAY BY SPLITTING EACH STRING
    Dim JaggedArray As Variant
    ReDim JaggedArray(LBound(SplitData, 1) To UBound(SplitData, 1))

    Dim Index As Long
    For Index = LBound(SplitData, 1) To UBound(SplitData, 1)
        JaggedArray(Index) = Split(SplitData(Index), Delimiter)
    Next Index

    'CONVERT JAGGED ARRAY INTO A TWO DIM ARRAY
    TextFileToArray = JaggedArrayToTwoDimArray(JaggedArray)

End Function

Read the content of the text file

This could be written in the main function, but it is normally best to try to separate your code into smaller blocks.

Public Function ReadTextFile(ByVal FilePath As String) As String

    Dim Fso As Object
    Set Fso = CreateObject("Scripting.FileSystemObject")

    Dim Ts As Object
    Set Ts = Fso.OpenTextFile(FilePath, 1, False)

    ReadTextFile = Ts.ReadAll

End Function

Convert Jagged Array into a 2d Array

I've stored the bounds in variables to make it easier to debug and read.

Private Function JaggedArrayToTwoDimArray(ByVal SourceArray As Variant) As Variant

    'CAPTURE BOUNDS
    Dim LB1 As Long
    LB1 = LBound(SourceArray, 1)

    Dim UB1 As Long
    UB1 = UBound(SourceArray, 1)

    Dim LB2 As Long
    LB2 = LBound(SourceArray(LB1), 1)

    Dim UB2
    UB2 = UBound(SourceArray(UB1), 1)

    'SET BOUNDS OF RETURN ARRAY
    Dim ReturnArray As Variant
    ReDim ReturnArray(LB1 To UB1, LB2 To UB2)

    'POPULATE TWO DIM ARRAY FROM JAGGED ARRAY
    Dim RowIndex As Long
    For RowIndex = LB1 To UB1

        Dim ColIndex As Long
        For ColIndex = LB2 To UB2
            ReturnArray(RowIndex, ColIndex) = SourceArray(RowIndex)(ColIndex)
        Next ColIndex

    Next RowIndex

    JaggedArrayToTwoDimArray = ReturnArray

End Function

Please feel free to add error handling and any changes that might be needed.

Upvotes: 1

Cyril
Cyril

Reputation: 6829

The first thing coming to mind is using InStr() which displays the location of the character in a string. Define your array dimensions beforehand (last row of your dataset); I will use i as iterator fro teh loop in the brief examples)

sep = InStr(Cells(i,1).Value,"BS") 'swap BS for whatever you want to separate on

Then use the separate (sep) to either array

arr(i,1) = left(Cells(i,1).value,sep-1)
arr(i,2) = right(Cells(i,1).value,len(cells(i,1))-sep+2) '2 characters in sep "b" and "s" so adding that back

The change I would suggest to your code is to use some symbol or string rather than a carriage return so it is easily separated.

Upvotes: 1

Related Questions