Reputation: 205
I have a tab-delimited text file encoded as UTF-8. When the fields contain only English characters, all the data is parsed and imported into the table successfully. The problem occurs when a field mixes in Chinese characters. How can I read and parse the data successfully when it contains mixed characters such as Chinese?
Below is a sample of the tab-delimited text file, which contains Chinese characters. In debug mode, the variable ls_unicode holds the content of the text file and the Chinese characters are present.
And when the data is saved in the Table, this is the output:
The script below reads the Chinese characters correctly, and the DW Update method returns success, but when I check the value in the column it shows "Globe MUX Project(?????:NA)" instead of "Globe MUX Project(客户合同号:NA)". I have also verified in debug mode that the value Globe MUX Project(客户合同号:NA) is present. The DB column has also been changed to the NVarChar data type.
//#################################
li_FileNum = FileOpen(is_sourcepath, StreamMode!, Read!, LockWrite!)
ll_FileLength = FileLength(is_sourcepath)
eRet = FileEncoding(is_sourcepath)
IF eRet = EncodingANSI! and ll_filelength <= 32765 THEN
li_bytes = FileReadEx(li_FileNum, lbl_data)
ls_unicode = String(lbl_data, EncodingUTF8!)
dw_1.Reset( )
dw_1.ImportString(ls_unicode)
ls_sonum = String(dw_1.Object.shipmentOrderNum[1])
ls_chinesechar = String(dw_1.Object.contractnum[1])
sle_char.Text = String(dw_1.Object.contractnum[1])
dw_1.SetItem(1,'contractnum',ls_chinesechar)
dw_1.SetItem(1,'fname','TEST')
END IF
FileClose(li_FileNum)
// Update() returns 1 on success and -1 on error, so commit only on 1.
IF dw_1.Update( ) = 1 THEN
Commit Using SQLCA;
END IF
//#################################
I also ran a test with a manual SQL INSERT statement, and it successfully stored the value 'Globe MUX Project(客户合同号:NA)' in the column. I think PB doesn't do this automatically when the column data type is NVarChar, NChar, or NText.
INSERT INTO SCH_HUAWEI_EDI_3B12RHDR ( COntractnum , FNAME )
VALUES ( N'Globe MUX Project(客户合同号:NA)' , 'TEST' )
Upvotes: 1
Views: 4252
Reputation: 205
I found out that this has to be handled in the column data type. I changed the data type of the DB column from VarChar to NVarChar and updated the table like this:
UPDATE SCH_HUAWEI_EDI_3B12RHDR SET contractnum = N'Globe MUX Project(客户合同号:NA)'
WHERE ShipmentOrderNum = 'DPH11309160073CC'
The expected result set is:
In the UPDATE statement, the value being set is preceded by a capital N. What would you recommend for incorporating the UPDATE statement above, given that I'm using a DataStore for updating? Or, the better question: how do I store Chinese characters using a DataStore in PowerBuilder?
Below is the PB Script:
IF (ids_edihdr.ImportFile(ls_SourcePath,1,1) = 1 ) AND (ids_edidtl.ImportFile(ls_SourcePath,2) > 0 ) THEN
//HEADER
IF ids_edihdr.RowCount() = 1 THEN
// Add script here to manage the mixed English and Chinese character values.
ids_edihdr.SetItem(1,'Fname',Upper(as_file))
ids_edihdr.SetItem(1,'CREATEDBY',Upper(SQLCA.LogID))
ids_edihdr.SetItem(1,'CREATEDDATE',idt_TranDate)
END IF
END IF
ids_edihdr.AcceptText()
ll_ret = ids_edihdr.Update()
IF ll_ret < 0 THEN GOTO ERR
Commit Using SQLCA;
ls_DestPath = is_ArchInboundPath + Upper(as_file)
FileCopy(ls_SourcePath,ls_DestPath)
FileDelete(ls_SourcePath)
GOTO DEST
ERR:
ROLLBACK Using SQLCA;
ls_ErrorPath = is_archerrorpath + Upper(as_file)
FileCopy(ls_SourcePath,ls_ErrorPath)
FileDelete(ls_SourcePath)
DEST:
Destroy ids_edihdr
Upvotes: 0
Reputation: 11465
PowerBuilder requires a BOM (Byte Order Mark) at the beginning of a UTF-8 or UTF-16 file in order to read it correctly, or to detect the encoding correctly with FileEncoding().
In your case, when you look at the file with a hex editor, the very first bytes must be EF BB BF, which is the UTF-8 BOM.
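If a hex editor is not at hand, the same check can be scripted outside PowerBuilder. This is only a minimal sketch, assuming Python is available on the machine; the function name has_utf8_bom is hypothetical and not part of any PB tooling.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    # Read in binary mode so no decoding is applied to the first bytes.
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8
```

Calling has_utf8_bom on your source file should return True before you rely on FileEncoding() reporting UTF-8.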
Once the file has a UTF-8 BOM, you should not have to convert the file content; PB will do it automagically. From PB v10 onward, all string data is converted and handled internally as UTF-16.
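If the system that produces the file cannot be changed to emit a BOM, one option is to add it in a pre-processing step before PB opens the file. The following is a rough sketch under that assumption, again in Python and outside PowerBuilder; add_utf8_bom is a hypothetical helper name, and the script rewrites the file in place, so try it on a copy first.

```python
import codecs


def add_utf8_bom(path):
    """Prepend the UTF-8 BOM to the file in place.

    Returns True if the BOM was added, False if it was already present.
    Raises UnicodeDecodeError if the content is not valid UTF-8.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        return False
    # Refuse to tag a file as UTF-8 when its bytes are not valid UTF-8.
    data.decode("utf-8")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + data)
    return True
```

The validity check is there because putting a BOM on a file that is really ANSI would only make PB misread it in a different way.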
BTW, in your proposed PB script, you are closing the file twice.
Upvotes: 1