spanky
spanky

Reputation: 153

How to save a excel file as csv with UTF-8 encoding?

I have been struggling for some time with the excel problem.I have Russian characters in my excel file (2003 version) and I want to save it to a csv.But the problem is whenever I do it instead of Russian characters I am getting ??????. After going through various forums I found out that the problem is with MS excel.

So Now I was wondering,if I could write a VB script which would read the strings from the required cells, covert them to UTF-8 encoding and store it in a text file. I have 2 problems here:

  1. Is my solution feasible or not? I have no idea if it can be done or not? Any pointers would be help full. I have trawled through net and couldn't find anything.

  2. As I am completely zero with VB scripting, can some body help me with a sample code please?

Most importantly, if someone knows a better way of doing it please let me know.

Thanks

Upvotes: 0

Views: 7718

Answers (2)

covarubias
covarubias

Reputation: 1

Open Office can do this! Try to load your Excel-file into Open Office and "save as" csv file. Then you will be asked to enter the encoding of the exported file. Select "Unicode (UTF-8)".

Upvotes: -1

Salman Arshad
Salman Arshad

Reputation: 272006

Here is a little vbscript that uses ADO to (i) read an excel file (ii) write to CSV file:

option explicit

' Reading Excel Files:
' http://support.microsoft.com/kb/257819

dim CONNECTION1
set CONNECTION1 = WScript.CreateObject("ADODB.CONNECTION")
CONNECTION1.Open "Provider=MICROSOFT.JET.OLEDB.4.0;Data Source=C:\Users\Salman\Desktop\input.xls;Extended Properties=""Excel 8.0;HDR=Yes;"""

dim RECORDSET1
set RECORDSET1 = WScript.CreateObject("ADODB.RECORDSET")
RECORDSET1.open "SELECT * FROM [Sheet1$]", CONNECTION1, 3, 1

' Notes:
' CharacterSet 65001 is UTF-8
' add/remove columns and change datatype to match you excel file

dim CONNECTION2
set CONNECTION2 = WScript.CreateObject("ADODB.CONNECTION")
CONNECTION2.Open "Provider=MICROSOFT.JET.OLEDB.4.0;Data Source=C:\Users\Salman\Desktop\;Extended Properties=""text;HDR=Yes;FMT=Delimited;CharacterSet=65001;"""
CONNECTION2.Execute "CREATE TABLE [output.csv] ([English] VARCHAR(200), [Swedish] VARCHAR(200), [Russian] VARCHAR(200), [Chinese Simplified] VARCHAR(200))"

dim RECORDSET2
set RECORDSET2 = WScript.CreateObject("ADODB.RECORDSET")
RECORDSET2.Open "SELECT * FROM [output.csv]", CONNECTION2, 2, 2

do until RECORDSET1.EOF
    RECORDSET2.AddNew
    dim i
    for i = 0 to RECORDSET1.Fields.Count - 1
        WScript.Echo RECORDSET1.Fields(i).Value
        RECORDSET2.Fields(i).Value = RECORDSET1.Fields(i).Value
    next
    RECORDSET2.Update
    RECORDSET1.MoveNext
loop

Example Excel File Content (Excel 2003):

English           Swedish            Russian              Chinese Simplified
this should work  Detta bör fungera  это должно работать  这应该工作
this should work  Detta bör fungera  это должно работать  这应该工作
this should work  Detta bör fungera  это должно работать  这应该工作
this should work  Detta bör fungera  это должно работать  这应该工作

Example CSV File Content (UTF-8 without BOM):

"English","Swedish","Russian","Chinese Simplified"
"this should work","Detta bör fungera","это должно работать","这应该工作"
"this should work","Detta bör fungera","это должно работать","这应该工作"
"this should work","Detta bör fungera","это должно работать","这应该工作"
"this should work","Detta bör fungera","это должно работать","这应该工作"

Run this script from command line using cscript:

C:\>cscript export.vbs

Note: you might encounter the following ADO error on 64-bit Windows:

ADODB.Connection: Provider cannot be found. It may not be properly installed. (Error code 0x800A0E7A)

The workaround for this problem is to run the script in 32-bit mode. You can do this by launching the 32-bit command prompt by entering this command in the Run dialog:

%WINDIR%\SysWOW64\cmd.exe

Execute the cscript command from this command prompt.

Upvotes: 3

Related Questions