RedHat

Reputation: 205

How to read/convert EncodingUTF8 to EncodingANSI in PowerBuilder?

I have a text file encoded as UTF-8 (EncodingUTF8!). All the data is parsed and imported into the table successfully when it contains only English characters, but a problem occurs when a field contains a mix of English and Chinese characters. How can I read and parse the data successfully when it contains mixed characters such as Chinese?

The sample tab-delimited text file contains Chinese characters. In debug mode, the variable ls_unicode holds the contents of the text file and the Chinese characters are present.


However, when the data is saved to the table, the Chinese characters are lost.

The script below reads the Chinese characters correctly and the DW Update method returns success, but when I check the value in the column it shows "Globe MUX Project(?????:NA)" instead of "Globe MUX Project(客户合同号:NA)". I have verified in debug mode that the value Globe MUX Project(客户合同号:NA) is present. The DB column has also been changed to the NVarChar data type.

//#################################
li_FileNum = FileOpen(is_sourcepath, StreamMode!, Read!, LockWrite!)
ll_FileLength = FileLength(is_sourcepath)
eRet = FileEncoding(is_sourcepath)
IF eRet = EncodingANSI! and ll_filelength <= 32765 THEN 
    li_bytes = FileReadEx(li_FileNum, lbl_data)     // stream mode: read the raw bytes into a blob
    ls_unicode = String(lbl_data, EncodingUTF8!)    // decode the blob as UTF-8 into a PB string

    dw_1.Reset( )
    dw_1.ImportString(ls_unicode)
    ls_sonum = String(dw_1.Object.shipmentOrderNum[1])
    ls_chinesechar = String(dw_1.Object.contractnum[1])
    sle_char.Text = String(dw_1.Object.contractnum[1])
    dw_1.SetItem(1,'contractnum',ls_chinesechar)
    dw_1.SetItem(1,'fname','TEST')
END IF
FileClose(li_FileNum)


IF dw_1.Update( ) = 1 THEN  // Update() returns 1 on success and -1 on error
    Commit Using SQLCA;
END IF
//#################################

I also ran a test with a manual SQL INSERT statement, and it successfully stored the value 'Globe MUX Project(客户合同号:NA)' in the column. I think PB doesn't do this automatically when the column data type is NVarChar, NChar, or NText.

INSERT INTO SCH_HUAWEI_EDI_3B12RHDR (  COntractnum , FNAME ) 
VALUES ( N'Globe MUX Project(客户合同号:NA)' , 'TEST' ) 

Upvotes: 1

Views: 4252

Answers (2)

RedHat

Reputation: 205

I found out that this has to be handled through the column data type. I changed the data type of the DB column from VarChar to NVarChar and updated the table like this:

UPDATE SCH_HUAWEI_EDI_3B12RHDR SET contractnum = N'Globe MUX Project(客户合同号:NA)' 
WHERE ShipmentOrderNum = 'DPH11309160073CC'

This produces the expected result: the column now holds Globe MUX Project(客户合同号:NA).

In the UPDATE statement, the value being set is prefixed with a capital letter N. What would you recommend for incorporating the UPDATE statement above, given that I'm using a DataStore for updating? Or, better put: how do I store Chinese characters using a DataStore in PowerBuilder?

Below is the PB Script:

IF (ids_edihdr.ImportFile(ls_SourcePath,1,1) = 1 ) AND (ids_edidtl.ImportFile(ls_SourcePath,2) > 0 ) THEN 
    //HEADER
    IF ids_edihdr.RowCount() = 1 THEN 

        // Add script here to manage the mixed English and Chinese character values.

        ids_edihdr.SetItem(1,'Fname',Upper(as_file))
        ids_edihdr.SetItem(1,'CREATEDBY',Upper(SQLCA.LogID))    
        ids_edihdr.SetItem(1,'CREATEDDATE',idt_TranDate)    
    END IF
END IF

ids_edihdr.AcceptText()

ll_ret = ids_edihdr.Update()
IF ll_ret < 0 THEN GOTO ERR

Commit Using SQLCA; 
ls_DestPath = is_ArchInboundPath + Upper(as_file)
FileCopy(ls_SourcePath,ls_DestPath)
FileDelete(ls_SourcePath)               
GOTO DEST

ERR:
ROLLBACK Using SQLCA;
ls_ErrorPath = is_archerrorpath + Upper(as_file)
FileCopy(ls_SourcePath,ls_ErrorPath)
FileDelete(ls_SourcePath)

DEST:
Destroy ids_edihdr

Upvotes: 0

Seki

Reputation: 11465

PowerBuilder requires a BOM (Byte Order Mark) at the beginning of a UTF-8 or UTF-16 file in order to read it correctly, or to detect the encoding correctly with FileEncoding().

In your case, when you look at the file with a hex editor, the very first bytes must be EF BB BF, which is the UTF-8 BOM.
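If you would rather check this from PowerScript than from a hex editor, something like the following should work. It is only a rough sketch, not tested against your file: it assumes PB 10.5 or later (where the byte data type and GetByte() are available), it reuses is_sourcepath from the question, and the other variable names are made up for illustration.

```powerscript
// Sketch: test whether the file starts with the UTF-8 BOM (EF BB BF).
// is_sourcepath comes from the question; all other names are hypothetical.
integer li_file
blob    lbl_head
byte    lb_1, lb_2, lb_3
boolean lb_has_bom = FALSE

li_file = FileOpen(is_sourcepath, StreamMode!, Read!, LockWrite!)
IF li_file > 0 THEN
    // Stream mode returns the raw bytes; only the first three are needed
    FileReadEx(li_file, lbl_head, 3)
    FileClose(li_file)

    IF Len(lbl_head) >= 3 THEN
        GetByte(lbl_head, 1, lb_1)
        GetByte(lbl_head, 2, lb_2)
        GetByte(lbl_head, 3, lb_3)
        // 239, 187, 191 decimal = EF, BB, BF hex
        lb_has_bom = (lb_1 = 239 AND lb_2 = 187 AND lb_3 = 191)
    END IF
END IF
```

If lb_has_bom ends up FALSE, the file is being treated as ANSI, which would match the EncodingANSI! result you get from FileEncoding().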

Once the file has a UTF-8 BOM, you should not have to convert the file content; PB will do it automagically. In PB v10 and later, all string data is converted and handled internally as UTF-16.

BTW, in your proposed pbscript, you are closing the file twice.

Upvotes: 1
