Chaity

Reputation: 1388

org.apache.cassandra.serializers.MarshalException in Cassandra 2.2.4

We are migrating data from MySQL to Cassandra. We dumped the MySQL data to CSV, converted it to a Cassandra-compatible CSV, and encoded it as UTF-8. Importing this CSV works fine on one cluster, but on another cluster running the same Cassandra version (2.2.4) the same file throws the following exception:

message="org.apache.cassandra.serializers.MarshalException: Invalid UTF-8 bytes 56bc71d9">
Aborting import at record #5. Previously inserted records are still present, and some records after that may be present as well.

The exception is raised for a different record on each run; neither the record number nor the byte values are consistent.

We used the following command to import the CSV file:

copy <TABLE> FROM <FILE> with DELIMITER = '\t' AND NULL = 'NULL' AND QUOTE = '\"' AND ESCAPE = '\\';

We looked for solutions, but most of them suggest the 'ASSUME' command. Since we are using Cassandra 2.2.4, we do not have cassandra-cli available to try that command.

Are there any suggestions for diagnosing this issue, or any known cases in which it can occur?

Upvotes: 1

Views: 546

Answers (1)

Ashraful Islam

Reputation: 12840

There is an issue about this in cassandra-lucene-index 2.2.4.1, which I submitted. It has already been fixed.
Just update your Lucene index code:
In the class com.stratio.cassandra.lucene.service.RegularCellsMapper,
in the method Columns columns(ColumnFamily columnFamily),
add the code below right after the line for (Cell cell : columnFamily) {

// Skip cells that are not live (for example deleted or expired ones).
if (!cell.isLive()) {
    continue;
}
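
Separately, if you want to rule out the input file itself, a small standalone check like the one below may help. This is only a sketch: Utf8Check, isValidUtf8 and firstInvalidLine are hypothetical names, not part of Cassandra or cassandra-lucene-index, and it reports physical line numbers, which can differ from COPY record numbers when quoted fields contain newlines.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for checking a CSV file before running COPY ... FROM.
final class Utf8Check {

    // Returns true if the bytes decode as strict UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the 1-based number of the first line that is not valid UTF-8,
    // or -1 if every line is valid. Splitting on the byte 0x0A is safe because
    // that byte never occurs inside a multi-byte UTF-8 sequence.
    static int firstInvalidLine(byte[] data) {
        int line = 1;
        int start = 0;
        for (int i = 0; i <= data.length; i++) {
            if (i == data.length || data[i] == '\n') {
                if (!isValidUtf8(Arrays.copyOfRange(data, start, i))) {
                    return line;
                }
                line++;
                start = i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Utf8Check <csv-file>");
            return;
        }
        int bad = firstInvalidLine(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(bad < 0 ? "File is valid UTF-8" : "Invalid UTF-8 on line " + bad);
    }
}
```

For reference, the bytes 56bc71d9 from the error message fail this check, because 0xbc is a continuation byte with no lead byte before it. If the whole file passes, the data is probably not the cause and the problem is more likely on the server side, as described above.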

Upvotes: 1
