Reputation: 4163
It's probably something simple. Here is what I tried:
Set objStream = CreateObject("ADODB.Stream")
Set objStreamNoBOM = CreateObject("ADODB.Stream")
With objStream
  .Charset = "UTF-8"
  .Open
  .WriteText "aaaaaa"
  .Position = 0
End With
With objStreamNoBOM
  '.Charset = "Windows-1252" ' works
  .Charset = "UTF-8" ' doesn't work!
  .Type = 2
  .Open
  .WriteText objStream.ReadText
  .SaveToFile "toto.php", 2
End With
If the charset is UTF-8, then there is ï»¿ at the beginning of the file (the UTF-8 byte order mark, bytes EF BB BF, shown as Windows-1252 characters).
Any idea on how to save a file with UTF-8 and without BOM?
Upvotes: 21
Views: 43285
Reputation: 38745
In the best of all possible worlds the Related list would contain a reference to this question which I found as the first hit for "vbscript bom vbscript".
Based on the second strategy from boost's answer:
Option Explicit
Const adSaveCreateNotExist = 1
Const adSaveCreateOverWrite = 2
Const adTypeBinary = 1
Const adTypeText = 2
Dim objStreamUTF8 : Set objStreamUTF8 = CreateObject("ADODB.Stream")
Dim objStreamUTF8NoBOM : Set objStreamUTF8NoBOM = CreateObject("ADODB.Stream")
With objStreamUTF8
  .Charset = "UTF-8"
  .Open
  .WriteText "aÄö"
  .Position = 0
  .SaveToFile "toto.php", adSaveCreateOverWrite
  .Type = adTypeText
  .Position = 3 ' skip the 3-byte UTF-8 BOM
End With
With objStreamUTF8NoBOM
  .Type = adTypeBinary
  .Open
  objStreamUTF8.CopyTo objStreamUTF8NoBOM
  .SaveToFile "toto-nobom.php", adSaveCreateOverWrite
End With
objStreamUTF8.Close
objStreamUTF8NoBOM.Close
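Evidence from the console; toto.php is 3 bytes larger because it still starts with the BOM (EF BB BF):
Active code page: 65001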
15.07.2015 18:48 5 toto-nobom.php
15.07.2015 18:48 8 toto.php
type toto-nobom.php
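The approach above can be wrapped into a reusable routine. The following is only a sketch: the names `SaveUtf8NoBom` and `HasUtf8Bom` are hypothetical helpers, not part of ADODB, and it assumes the text is non-empty so that the stream really holds at least the 3-byte BOM before `Position = 3` is set.

```vbscript
Option Explicit

Const adTypeBinary = 1
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

' Hypothetical helper: write text to path as UTF-8 without a BOM.
' Assumes text is non-empty.
Sub SaveUtf8NoBom(path, text)
    Dim src : Set src = CreateObject("ADODB.Stream")
    Dim dst : Set dst = CreateObject("ADODB.Stream")

    src.Type = adTypeText
    src.Charset = "UTF-8"
    src.Open
    src.WriteText text      ' stream now holds EF BB BF plus the encoded text
    src.Position = 3        ' Position counts bytes; skip the 3-byte BOM

    dst.Type = adTypeBinary
    dst.Open
    src.CopyTo dst          ' copies from the current position to the end
    dst.SaveToFile path, adSaveCreateOverWrite

    dst.Close
    src.Close
End Sub

' Hypothetical helper: True if the file starts with EF BB BF.
Function HasUtf8Bom(path)
    Dim stm : Set stm = CreateObject("ADODB.Stream")
    Dim head
    HasUtf8Bom = False
    stm.Type = adTypeBinary
    stm.Open
    stm.LoadFromFile path
    If stm.Size >= 3 Then
        head = stm.Read(3)
        HasUtf8Bom = (AscB(MidB(head, 1, 1)) = &HEF And _
                      AscB(MidB(head, 2, 1)) = &HBB And _
                      AscB(MidB(head, 3, 1)) = &HBF)
    End If
    stm.Close
End Function

SaveUtf8NoBom "toto-nobom.php", "aÄö"
WScript.Echo HasUtf8Bom("toto-nobom.php")
```

Run with cscript, the last line should report that no BOM is present. Checking the first three bytes is a more direct test than comparing file sizes by eye.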
Upvotes: 40
Reputation: 4726
I knew that the Scripting File System Object's stream inserted a Byte Order Mark, but I haven't seen that with the ADODB Stream.
Or at least, not yet: I rarely use the ADODB stream object...
But I do remember putting this remark into some code a few years ago:
' **** WHY THIS IS COMMENTED OUT **** **** **** **** **** **** **** ****
' Microsoft ODBC and OLEDB database drivers cannot read the field names from
' the header when a unicode byte order mark (&HFF & &HFE) is inserted at the
' start of the text by Scripting.FileSystemObject 'Write' methods. Trying to
' work around this by writing byte arrays will fail; FSO 'Write' detects the
' string encoding automatically, and won't let you hack around it by writing
' the header as UTF-8 (or 'Narrow' string) and appending the rest as unicode
' (Yes, I tried some revolting hacks to get around it: don't *ever* do that)
' **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
' With FSO.OpenTextFile(FilePath, ForWriting, True, TristateTrue)
' .Write Join(arrTemp1, EOROW)
' .Close
' End With ' textstream object from objFSO.OpenTextFile
' **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
You can tell I was having a bad day.
Next, using prehistoric PUT commands from the days before file-handling had emerged from the primordial C:
' Put #hndFile, , StrConv(Join(arrTemp1, EOROW), vbUnicode)
' Put #hndFile, , Join(arrTemp1, EOROW)
' If you pass unicode, Wide or UTF-16 string variables to PUT, it prepends a
' Unicode Byte Order Mark to the data which, when written to your file, will
' render the field names illegible to Microsoft's JET ODBC and ACE-OLEDB SQL
' drivers (which can actually read unicode field names, if the helpful label
' isn't in the way). However, the 'PUT' statement writes a Byte array as-is.
' **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
So here's the code that actually does it:
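Dim arrByte() As Byte
Dim strText As String
Dim hndFile As Integer ' FreeFile returns an Integer file number
strText = "Y'all knew that strings are actually byte arrays?"
arrByte = strText ' copies the string's raw bytes into the array
hndFile = FreeFile
Open FilePath For Binary As #hndFile ' FilePath is assumed to be defined elsewhere
Put #hndFile, , arrByte ' a Byte array is written as-is, with no BOM
Close #hndFile
Erase arrByte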
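I'm assuming that strText is actually UTF-8. I mean, we're in VBA, in Microsoft Office, and we absolutely know that this is always going to be UTF-8, even if we use it in a foreign country... (It isn't, of course: VBA strings are UTF-16 internally, so the byte array holds UTF-16LE bytes without a BOM.)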
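If genuine UTF-8 bytes are needed for the Put statement, one option is to let an ADODB stream do the encoding and then drop the BOM. This is a sketch, not a tested drop-in: `Utf8BytesNoBom` is a hypothetical name, the text is assumed to be non-empty, and the code sticks to the untyped VBScript subset, which should also run in VBA.

```vbscript
' Hypothetical helper: return text encoded as UTF-8 bytes, without the BOM.
' Assumes text is non-empty.
Function Utf8BytesNoBom(text)
    Const adTypeBinary = 1
    Const adTypeText = 2
    Dim stm : Set stm = CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "UTF-8"
    stm.Open
    stm.WriteText text
    stm.Position = 0          ' Type can only be changed at position 0
    stm.Type = adTypeBinary   ' treat the same bytes as binary data
    stm.Position = 3          ' skip the BOM (EF BB BF)
    Utf8BytesNoBom = stm.Read ' remaining bytes as a byte array
    stm.Close
End Function
```

In VBA the result could be assigned to a dynamic Byte array, for example `arrByte = Utf8BytesNoBom(strText)`, and then written with `Put #hndFile, , arrByte`.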
Upvotes: 4