kassprek

Reputation: 1007

Processing text file or string variable with foreign language characters

I would like to apply VBA string functions such as LCase$() and then UCase() to the contents of my test.xml file, which is UTF-8 encoded. The sample code below loads the file:

Dim objFileSystem, objInputFile

Set objFileSystem = CreateObject("Scripting.fileSystemObject")
Set objInputFile = objFileSystem.OpenTextFile("c:\test.xml", 1)

inputData = objInputFile.ReadAll

objInputFile.Close

Next I try to convert the contents to lower case and then change the first letter to upper case:

Var = inputData
Var = LCase$(Var)

Select Case Len(Var)

Case 0
CapitilizeFirstLetter = ""

Case 1
CapitilizeFirstLetter = UCase(Var)

Case Else
CapitilizeFirstLetter = Ucase(Left(Var, 1)) & mid(Var, 2)

End Select

Then I try to save the contents under the name test_edited.xml:

FileUrl = "c:\test_edited.xml"

Set objStream = CreateObject("ADODB.Stream")
With objStream
    .Open
    .Charset = "utf-8" 'for Russian: iso-8859-5
    .Position = objStream.Size
    .WriteText=Var
    .Flush
    .Position = 0
    .Type     = 1 'binary
    .Read(3)      'skip BOM
    .SaveToFile FileUrl,2
    .Close
End With
Set objStream = Nothing

As a result, the content of the first file was:

Nejznámější ŽENY, MODELY, herečka, zpěvačka

And the second one is now

Nejznámější ŽENY, MODELY, herečka, zpěvačka

But I expected it to look like this:

Nejznámější ženy, modely, herečka, zpěvačka

What am I doing wrong?

I'm using Basic IDE ver 6.4.

The complete code should look like this:

Sub Main

'getting variable from outside
ChanNum = DDEInitiate("MacroEngine", "MacroGetVar")
Var$ = DDERequest$(ChanNum, "vChannelOpisA")
    DDETerminate ChanNum


Var = LCase$(Var) ' converting utf-8 encoded string to lower case

'change first letter to upper case
Select Case Len(Var)

Case 0
CapitilizeFirstLetter = ""

Case 1
CapitilizeFirstLetter = UCase(Var)

Case Else
CapitilizeFirstLetter = Ucase(Left(Var, 1)) & mid(Var, 2)

End Select

'sending variable to outside of vb script
ChanNum = DDEInitiate("MacroEngine","MacroSetVar")
Var = "vChannelOpisA=" + CapitilizeFirstLetter
DDEExecute (ChanNum, Var)
DDETerminate ChanNum

End Sub

In the end, the variable Var should be UTF-8 encoded so that it can be written to an XML file. Alternatively, I can read the string from a file instead of getting it with DDERequest.

Upvotes: 4

Views: 1952

Answers (2)

pascal b

Reputation: 371

In my experience, handling UTF-8 and ISO 8859-1 in VBA can be tricky, because the result depends on the editor that produced the file and on the system environment (Unix, Windows, or Mac). Most text editors and systems default to ANSI. I would advise you to try ADODB.Stream for reading, since it can read and render UTF-8, and a separate object for writing the result.

Const bufFile = "c:\test.xml"
Const stf = "c:\test_edited.xml"
Dim vData As Variant
Dim adoRead As Object
Dim fso As Object
Dim fil As Object
Dim ftxt As Object
Dim j As Long

'ADODB stream to read
Set adoRead = CreateObject("ADODB.Stream")
adoRead.Charset = "unicode"
adoRead.Open
adoRead.LoadFromFile bufFile
vData = Split(adoRead.ReadText, vbCrLf)
adoRead.Close

'FileSystemObject text stream to write (GetFile requires that the file already exists)
Set fso = CreateObject("Scripting.FileSystemObject")
Set fil = fso.GetFile(stf)
Set ftxt = fil.OpenAsTextStream(2, -2) '2 = ForWriting, -2 = TristateUseDefault

'process your data as intended
For j = LBound(vData) To UBound(vData)
   'code to capitalize...
   '...
   'write to
   ftxt.WriteLine vData(j)
Next j
ftxt.Close

This structure worked for me with French characters; I think it should work the same way with the UTF-8 or Unicode character set.

Cheers

Pascal

Upvotes: 1

kassprek

Reputation: 1007

After three days of hard work and research, I finally got it. My macro program works in the windows-1250 charset, so I must first convert my string to UTF-8 and, at the end, convert it back to windows-1250. The code that works for me is given below.

Sub Main

' Retrieving the variable from outside VBA
ChanNum = DDEInitiate("MacroEngine", "MacroGetVar")
Var$ = DDERequest$(ChanNum, "vChannelOpisA")
DDETerminate ChanNum


Dim objStream As Object

' Converting string variable from Windows-1250 to utf-8
Set objStream = CreateObject("ADODB.Stream")
objStream.Open
objStream.Type     = 2 'Specify stream type text data.
objStream.Charset  = "WIndows-1250" 'Specify charset For the source text data.
objStream.WriteText Var
objStream.Position = 0
objStream.Charset  = "utf-8"
Var = objStream.ReadText
objStream.Close


' Processing the string characters to lower case and change first letter to upper case
Var = LCase$(Var)

Select Case Len(Var)
Case 0
CapitilizeFirstLetter = ""
Case 1
CapitilizeFirstLetter = UCase(Var)
Case Else
CapitilizeFirstLetter = Ucase(Left(Var, 1)) & mid(Var, 2)
End Select

' Converting the edited string back to windows-1250.
Set objStream = CreateObject("ADODB.Stream")
objStream.Open
objStream.Type     = 2 'Specify stream type text data.
objStream.Charset  = "utf-8" 'Specify charset For the source text data.
objStream.WriteText CapitilizeFirstLetter
objStream.Position = 0
objStream.Charset  = "WIndows-1250"
CapitilizeFirstLetter = objStream.ReadText
objStream.Close

' Sending string variable to my Macro engine
ChanNum = DDEInitiate("MacroEngine","MacroSetVar")
Var = "vChannelOpisA=" + CapitilizeFirstLetter
DDEExecute (ChanNum, Var)
DDETerminate ChanNum

End Sub
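
The two conversion blocks in the macro above differ only in their charset names, so they could be folded into a single helper. The following is a minimal sketch, not part of the original macro: the name ConvertCharset and its parameters are hypothetical, and it assumes that the Basic dialect in use accepts user-defined functions with typed arguments, as VBA does.

```vba
' Hypothetical helper (not in the original macro).
' Writes sText into an ADODB.Stream using sFromCharset, then reads the
' same bytes back as text in sToCharset, exactly as the macro does inline.
Function ConvertCharset(sText As String, sFromCharset As String, sToCharset As String) As String
    Dim objStream As Object

    Set objStream = CreateObject("ADODB.Stream")
    objStream.Open
    objStream.Type = 2                 ' 2 = text data
    objStream.Charset = sFromCharset   ' charset used when writing the text
    objStream.WriteText sText
    objStream.Position = 0             ' rewind before switching charset
    objStream.Charset = sToCharset     ' charset used when reading it back
    ConvertCharset = objStream.ReadText
    objStream.Close
    Set objStream = Nothing
End Function
```

With such a helper, the first conversion would become Var = ConvertCharset(Var, "windows-1250", "utf-8") and the final one CapitilizeFirstLetter = ConvertCharset(CapitilizeFirstLetter, "utf-8", "windows-1250").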

And here is an example if you want to load the data from a file instead.

Sub Main

sFileToEdit = "c:\test.xml"
sFileEdited = "c:\test_edited.xml"


Dim objStream As Object

Set objStream = CreateObject("ADODB.Stream")
objStream.Type = 2 'Stream type: text data
objStream.Charset = "utf-8" 'Charset of the source file
objStream.Open 'Open the stream before loading the file
objStream.LoadFromFile sFileToEdit
ReadFileData = objStream.ReadText
objStream.Close

ReadFileData = LCase$(ReadFileData)

Select Case Len(ReadFileData)
Case 0
CapitilizeFirstLetter = ""
Case 1
CapitilizeFirstLetter = UCase(ReadFileData)
Case Else
CapitilizeFirstLetter = Ucase(Left(ReadFileData, 1)) & mid(ReadFileData, 2)
End Select


Set objStream = CreateObject("ADODB.Stream")
objStream.Type = 2 'Stream type: text data
objStream.Charset = "utf-8" 'Charset used for the output file
objStream.Open 'Open the stream before writing to it
objStream.WriteText CapitilizeFirstLetter
objStream.SaveToFile sFileEdited, 2 'Save to disk; 2 = overwrite an existing file
objStream.Close

End Sub
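
The LCase$ call and the Select Case block that follow it in each listing could likewise be wrapped in one function. This is only a sketch: the name CapitalizeFirst is hypothetical, and it relies on Mid$ returning an empty string when the start position lies past the end of the string. That is the VBA behaviour, but it is an assumption for other Basic dialects.

```vba
' Hypothetical helper (not in the original code).
' Lower-cases the whole string, then upper-cases its first character.
Function CapitalizeFirst(sText As String) As String
    Dim sLower As String

    sLower = LCase$(sText)
    If Len(sLower) = 0 Then
        CapitalizeFirst = ""
    Else
        ' For a one-character string, Mid$(sLower, 2) is empty,
        ' so no separate "Case 1" branch is needed.
        CapitalizeFirst = UCase$(Left$(sLower, 1)) & Mid$(sLower, 2)
    End If
End Function
```

In the file example this would replace the LCase$ line and the Select Case block with CapitilizeFirstLetter = CapitalizeFirst(ReadFileData). As in the listings above, the string must already have been read with the correct charset for the case conversion to affect accented letters.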

Upvotes: 3
