Shawn
Shawn

Reputation: 2366

vb.net xls to csv with quotes?

I have a xls file, or a csv without quotes, and using vb.net need to turn it into a csv with quotes around every cell. If I open the xls/csv without quotes in MS Access, set every column to text and then export it, its in the format I need. Is there an easier way? If not, how do I do replicate this in vb.net? Thanks.

Upvotes: 0

Views: 1939

Answers (3)

Public Class clsTest

Public Sub Test
Dim s as string = "C:\!Data\Test1.csv"

        Dim Contents As String = System.IO.File.ReadAllText(s)


        Dim aryLines As String() = Contents.Split(New String() { Environment.Newline }, StringSplitOptions.None)
        Dim aryParts() As String
        Dim aryHeader() As String
        Dim dt As System.Data.DataTable 
        For i As Integer = 0 To aryLines.Length - 1
            aryParts = SplitCSVLine(aryLines(i))
            If dt Is Nothing And aryHeader Is Nothing Then 
                aryHeader = CType(aryParts.Clone, String())
            ElseIf dt Is Nothing And aryHeader IsNot Nothing Then
                dt = DTFromStringArray(aryParts, 1000, "", aryHeader)
            Else
                DTAddStringArray(dt, aryParts)
            End If
        Next
        dt.dump
End Sub 

Public Shared Function SplitCSVLine(strCSVQuotedLine As String) As String()
        Dim aryLines As String() = strCSVQuotedLine.Split(New String() {Environment.NewLine}, StringSplitOptions.None)
        Dim aryParts As String() = Nothing
        For i As Integer = 0 To aryLines.Length - 1
            Dim regx As New Text.RegularExpressions.Regex(",(?=(?:[^\""]*\""[^\""]*\"")*(?![^\""]*\""))")
            aryParts = regx.Split(aryLines(i))
            For p As Integer = 0 To aryParts.Length - 1
                aryParts(p) = aryParts(p).Trim(" "c, """"c)
            Next
        Next
        Return aryParts
End Function

Public Shared Function DTFromStringArray(ByVal aryValues() As String, Optional ByVal intDefaultColumnWidth As Integer = 255, Optional ByVal strTableName As String = "tblArray", Optional ByVal aryColumnNames() As String = Nothing) As DataTable
        If String.IsNullOrWhiteSpace(strTableName) Then strTableName = "tblArray"
        Dim dt As DataTable = New DataTable(strTableName)
        Dim colNew(aryValues.GetUpperBound(0)) As DataColumn
        If aryColumnNames Is Nothing Then
            ReDim aryColumnNames(aryValues.Length)
        Else
            If aryColumnNames.GetUpperBound(0) < aryValues.GetUpperBound(0) Then
                ReDim Preserve aryColumnNames(aryValues.Length)
            End If
        End If
        For x As Integer = aryColumnNames.GetLowerBound(0) To aryColumnNames.GetUpperBound(0)
            If String.IsNullOrWhiteSpace(aryColumnNames(x)) Then
                aryColumnNames(x) = "Field" & x.ToString
            Else
                aryColumnNames(x) = aryColumnNames(x)
            End If
        Next
        For i As Integer = 0 To aryValues.GetUpperBound(0)
            colNew(i) = New DataColumn
            With colNew(i)
                .ColumnName = aryColumnNames(i) '"Value " & i
                .DataType = GetType(String)
                .AllowDBNull = False
                .DefaultValue = ""
                .MaxLength = intDefaultColumnWidth
                .Unique = False
            End With
        Next
        dt.Columns.AddRange(colNew)
        Dim pRow As DataRow = dt.NewRow
        For i As Integer = aryValues.GetLowerBound(0) To aryValues.GetUpperBound(0)
            pRow.Item(i) = aryValues(i)
        Next
        dt.Rows.Add(pRow)
        Return dt
End Function

Public Shared Sub DTAddStringArray(ByRef dt As DataTable, ByVal aryRowValues() As String)
        Dim pRow As DataRow
        pRow = dt.NewRow
        For i As Integer = aryRowValues.GetLowerBound(0) To aryRowValues.GetUpperBound(0)
            pRow.Item(i) = aryRowValues(i)
        Next
        dt.Rows.Add(pRow)
End Sub

End Class

Upvotes: -1

Ekkehard.Horner
Ekkehard.Horner

Reputation: 38755

If you use the .Net OLE DB provider, you can specify the .csv formatting details in a schema.ini file in the folder your data files live in. For the 'unquoted' .csv the specs should look like

[noquotes.csv]        <-- file name
ColNameHeader=True    <-- or False
CharacterSet=1252     <-- your encoding
Format=Delimited(,)   <-- 
TextDelimiter=        <-- important: no " in source file
Col1=VendorID Integer <-- your columns, of course
Col2=AccountNumber Char Width 15

for the 'quoted' .csv, just change the name and delete the TextDelimiter= line (put quotes around text fields is the default).

Then connect to the Text Database and execute the statement

SELECT * INTO [quotes.csv] FROM [noquotes.csv]

(as this creates quotes.csv, you may want to delete the file before each experimental run)

Added to deal with "Empty fields must be quoted"

This is a VBScript demo, but as the important things are the parameters for .GetString(), you'll can port it to VB easily:

    Dim sDir   : sDir       = resolvePath( "§LibDir§testdata\txt" )
    Dim sSrc   : sSrc       = "noquotes.csv"
    Dim sSQL   : sSQL       = "SELECT * FROM [" & sSrc & "]"
    Dim oTxtDb : Set oTxtDb = New cADBC.openDb( Array( "jettxt", sDir ) )
    WScript.Echo goFS.OpenTextFile( goFS.BuildPath( sDir, sSrc ) ).ReadAll()
    Dim sAll : sAll = oTxtDb.GetSelectFRO( sSQL ).GetString( _
                             adClipString, , """,""", """" & vbCrlf & """", "" _
                      )
    WScript.Echo   """" & Left( sAll, Len( sAll ) - 1 )

and output:

    VendorID;AccountNumber;SomethingElse
    1;ABC 123 QQQ;1,2
    2;IJK 654 ZZZ;2,3
    3;;3,4

    "1","ABC 123 QQQ","1,2"
    "2","IJK 654 ZZZ","2,3"
    "3","","3,4"

(german locale, therefore field separator ; and decimal symbol ,)

Same output from this VB.Net code:

    Imports ADODB
    ...

        Sub useGetString()
            Console.WriteLine("useGetString")

            Const adClipString As Integer = 2
            Dim cn As New ADODB.Connection
            Dim rs As ADODB.Recordset
            Dim sAll As String

            cn.ConnectionString = _
                 "Provider=Microsoft.Jet.OLEDB.4.0;" _
               & "Data Source=M:\lib\kurs0705\testdata\txt\;" _
               & "Extended Properties=""text;"""

            cn.Open()
            rs = cn.Execute("SELECT * FROM [noquotes.csv]") 
            sAll = rs.GetString( adClipString, , """,""", """" & vbCrLf & """", "" )
            cn.Close()
            sAll = """" & Left( sAll, Len( sAll ) - 1 )
            Console.WriteLine( sAll )
        End Sub

Upvotes: 2

nickmoriarty
nickmoriarty

Reputation: 1263

Check out the method at this link. What you can do to make sure quotes go around is append quotes to the beginning and end of each column data in the loop that is putting the column data in the file.

for example make the loop like this:

For InnerCount = 0 To ColumnCount - 1
    Str &= """" & DS.Tables(0).Rows(OuterCount).Item(InnerCount) & ""","
Next

Upvotes: 1

Related Questions