Reputation: 357
I am trying to write a macro to automatically import a csv file my workbook. However I met with issue parsing the delimiters, specifically the readline method of the filesystemobject does not read a complete line when called
Example
a line in csv:
1,2,jack,"there can be many boys in the room"3,4,test,n.a
what readline extract
1,2,jack,"there can be many
This causes the sheet to end up like
1 | 2 | jack |there can be may
boys in the room| 3 | 4 | test | na
Any ideas what might be causing this issue?
Thanks
Upvotes: 2
Views: 1742
Reputation: 3710
The most proper way would be to handle and get rid of unprintable characters, as Pieter Geerkens recommends, or to read everything and split like PatricK recommends. But then, you may end up with having too much pieces again, if there are really some unexpected newlines.
Therefore here is a hint how to make your reading more robust on a, say, semantic level.
The idea is to read a line and to decide if it was a full line:
...
var line as String
var lineFinished as boolean
' Loop starts here
...
lineFinished = false
...
' Read a line, or a piece of it.
linePiece = a.ReadLine ' or similar
...
' Now let's count the number of quotas:
dim cnt as integer
cnt = 0
for i=1 to len(line)
if mid(line, 0, i) = """" then
cnt = cnt + 1
end if
next
' If there is an odd number of quotas, the line is not finished:
lineFinished = (cnt mod 2 = 0) and (cnt > 0)
' If the line is finished, then take it as a full line. Otherwise, add the pieces up.
if lineFinished then
lineFinal = linePiece
else
lineFinal = lineFinal & linePiece
end if
...
' Then, use this place to clean the line from other nasty chars:
line = replace(line, "\n", "")
line = replace(line, "\r", "")
line = replace(line, "\t", "")
...
' Then, put your lineFinal to the whole string and reset the variable for the next loop.
I am aware that replace and counting that way feels very clumsy. But somehow, this is VBA. Like this, you won't need the regex library and you can add up your experience directly to the code by adding lines. If you discover a new character that disturbs, just add it to the replace lines.
One may discuss if it would be better to check the final line for being finished, instead of check the pieces for being only parts of a line. But the one or other way, you may have some uncertainties if you read a very small piece without any quotas at all (therefore the cnt > 0
). But let's not hope your file is that poisoned ;-)
Good luck.
EDIT:
Maybe an even better approach, for the matter of counting, would be to count the number of commas ,
. So, you can rather precisely measure how "complete" your line is already.
Upvotes: 1