Md Sirajus Salayhin

Reputation: 5144

Error in Pentaho Data Integration - invalid byte sequence for encoding "UTF8": 0x00

I am getting the following error while bulk-inserting rows with Pentaho Data Integration. I am using PostgreSQL.

ERROR: invalid byte sequence for encoding "UTF8": 0x00 

Upvotes: 0

Views: 3531

Answers (3)

Md Sirajus Salayhin

Reputation: 5144

Finally I got the solution:

  • In Table Input, check the "Enable lazy conversion" option
  • Add a "Select Values" step, select all fields, and on the "Metadata" tab set the encoding to "UTF-8" for every field.

Upvotes: 0

ChoCho

Reputation: 489

The 0x00 in the error message is the null character. You can use a "Modified Javascript" step to strip it from the affected field, as follows:

function removeNull(e) {
    if (e != null)
        return e.replace(/\0/g, '');
    else
        return '';
}

var replacedString = removeNull(fieldToRemoveNullChars);

Select the new field (replacedString) as the output of the Modified Javascript step, and voilà. I used to have this problem with incoming AS400 data.
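If several fields are affected, the same replacement can be applied to each of them. A minimal sketch in plain JavaScript, assuming the row is available as an object; the `stripNulls` helper and the row shape are hypothetical, since the Modified Javascript step exposes fields as individual variables rather than as one object:

```javascript
// Hypothetical helper: return a copy of the row with null characters
// (U+0000) removed from every string value. Non-string values are kept as-is.
function stripNulls(row) {
    var cleaned = {};
    for (var key in row) {
        if (!Object.prototype.hasOwnProperty.call(row, key)) continue;
        var value = row[key];
        cleaned[key] = (typeof value === 'string')
            ? value.replace(/\u0000/g, '')
            : value;
    }
    return cleaned;
}
```

Unlike removeNull above, this sketch leaves null values untouched instead of turning them into empty strings, so a NULL column stays NULL in PostgreSQL.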

Upvotes: 2

Pavel Stehule

Reputation: 45760

PostgreSQL is very strict about the content of text fields and doesn't allow 0x00 in UTF-8 encoded fields. You should fix your input data.

One possible solution: https://superuser.com/questions/287997/how-to-use-sed-to-remove-null-bytes
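If the input is a file, the null bytes can also be removed before the data ever reaches Pentaho. A minimal Node.js sketch, assuming the file is UTF-8 encoded (in UTF-8 the byte 0x00 only ever encodes the null character, so dropping it cannot damage other characters; this is not true for UTF-16). The `stripNullBytes` name and the file names in the comment are made up for illustration:

```javascript
// Hypothetical helper: return a new Buffer without any 0x00 bytes.
// Only safe for single-byte or UTF-8 encoded data.
function stripNullBytes(buf) {
    return Buffer.from(buf.filter(function (byte) {
        return byte !== 0;
    }));
}

// Possible usage (file names are placeholders):
//   var fs = require('fs');
//   fs.writeFileSync('clean.csv', stripNullBytes(fs.readFileSync('input.csv')));
```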

Upvotes: 0
