Reputation: 10476
I need to import some CSVs into Excel spreadsheet, the row/column numbers of the CSVs are different for each of them. The problem is that some values are long digit strings like
341235387313289173719237217391
,
Excel will treat those values as (double) numbers, and then results in lost of data.
My way to solve it is to use the following vbafunction to do the job:
Sub readCSV(f As TextStream, sh As Worksheet)
i = 1
Do
l = Trim(f.ReadLine)
If l = "" Then Exit Sub 'skip the last empty line(s)
l = Mid(l, 2, Len(l) - 1)
ss = Split(l, """,""")
For j = LBound(ss) To UBound(ss) 'j starts from 0
Dim a As Range
With sh.Cells(i, j + 1)
.NumberFormat = "@" 'Force to text format
.Value = ss(j)
End With
DoEvents 'Avoid blocking the GUI
Next j
i = i + 1
Loop Until f.AtEndOfStream
End Sub
The problem is the performance. It is much slower than importing the data through Data->From Text or just open the CSVs directly.
Are there any way to do it more efficiently?
Upvotes: 2
Views: 713
Reputation: 166885
You can format/write each line in one shot:
Sub readCSV(f As TextStream, sh As Worksheet)
Dim i As Long
Dim ss, l
i = 1
With Application
.ScreenUpdating = False
.Calculation = xlCalculationManual
End With
Do
l = Trim(f.ReadLine)
If l = "" Then Exit Sub 'skip the last empty line(s)
l = Mid(l, 2, Len(l) - 1)
ss = Split(l, """,""")
With sh.Cells(i, 1).Resize(1, (UBound(ss) - LBound(ss)) + 1)
If (i-1) Mod 100 = 0 Then .Resize(100).NumberFormat = "@"
.Value = ss
End With
i = i + 1
Loop Until f.AtEndOfStream
With Application
.ScreenUpdating = True
.Calculation = xlCalculationAutomatic
End With
End Sub
EDIT: after testing, the real performance killer is setting the cell format to text- revised code to set this in blocks of 100 lines instead of each line.
Upvotes: 2
Reputation: 55702
Rather than work in Excel (by cell or by line) you could use a Regexp
to quikcly create a second version of your CSV file with everystring of alpanumerics longer than 16 characters updated with a preceding '
Then simply import or open the entire new csv in Excel
Sample code that runs on a CSV file StrIn
with this path for the example, "c:\Temp\test.csv"
Sub Main()
Dim objFSO As Object
Dim objRegex As Object
Dim objTF As Object
Dim objTF2 As Object
Dim tf As Object
Dim strIn As String
Dim strOut As String
Dim strFile As String
strIn = "c:\Temp\test.csv"
strOut = "c:\Temp\test2.csv"
Set objFSO = CreateObject("scripting.filesystemobject")
Set objTF = objFSO.getfile(strIn)
Set objRegex = CreateObject("vbscript.regexp")
Set tf = objTF.OpenAsTextStream(ForReading)
strFile = tf.ReadAll
With objRegex
.Pattern = "(\w{16,})"
.Global = True
strFile = .Replace(strFile, "'" & "$1")
End With
Set objTF2 = objFSO.OpenTextFile(strOut, ForWriting, True)
objTF2.Write strFile
objTF2.Close
tf.Close
End Sub
Upvotes: 1
Reputation: 5449
Try .Value = "'" & ss(j)
The '
forces the value to display as a text string in Excel.
Also, try declaring your ss array in a string so it doesnt store the numbers as longs after splitting. Something like:
Dim ss() as String = Split(l, """,""")
Upvotes: 0