Moshe Hoadley

Reputation: 21

Cassandra 1.2.5 - invalid UTF8 bytes

I am reading and writing large volumes of data to and from a column family (CF).

After a while, I get the following error:

INFO [MemoryMeter:1] 2013-07-03 09:41:34,438 Memtable.java (line 238) CFS(Keyspace='amlear', ColumnFamily='tmp2_rpt_rptStats_popkeywrd_sp_G') liveRatio is 4.12192 (just-counted was 4.12192).  calculation took 168ms for 2048 columns
ERROR [ReadStage:706] 2013-07-03 09:41:56,187 CassandraDaemon.java (line 175) Exception in thread Thread[ReadStage:706,5,main]
java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73
    at org.apache.cassandra.db.marshal.UTF8Type.getString(UTF8Type.java:54)
    at org.apache.cassandra.dht.AbstractBounds.format(AbstractBounds.java:103)
    at org.apache.cassandra.dht.AbstractBounds.getString(AbstractBounds.java:96)
    at org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:1387)
    at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1443)
    at org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
    at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1076)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
    ... 3 more

Note: I recently upgraded from Cassandra 1.1.4 to Cassandra 1.2.5 (I don't know whether that is relevant). Java version: 1.6.0_32.

Does anyone have any idea how to solve this?

Upvotes: 2

Views: 785

Answers (1)

Oliver Seiler

Reputation: 116

Caused by: org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73

You have invalid UTF-8 bytes in the middle of this.

Specifically, the 2-byte sequence c0 80 starting at the 17th byte is invalid. I'm not sure what character was intended, but it was probably NUL (U+0000), which should be the single byte 00 in UTF-8. The lowest valid 2-byte sequence in UTF-8 is c2 80, which encodes U+0080.

Perhaps a broken UTF-8 encoder on the writing side?
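To check the byte offset yourself, here is a minimal Python sketch that decodes the hex dump from the exception and reports where strict UTF-8 decoding fails. The helper name `find_invalid_utf8` is made up for illustration. The last step assumes the c0 80 pair was meant as NUL, which is how Java's "modified UTF-8" (the format written by `DataOutputStream.writeUTF`) encodes U+0000; that is a guess about how the data was produced, not something the log confirms.

```python
# Hex dump copied from the MarshalException message.
HEX_KEY = "37464646464646464646464638333943c08074656c65666f6e6f73206170617261746f7320792061636365736f72696f73"


def find_invalid_utf8(data: bytes):
    """Return (offset, reason) for the first invalid UTF-8 byte, or None if data is valid.

    The offset is 0-based; add 1 to get the position counted from 1.
    """
    try:
        data.decode("utf-8")  # strict decoding raises on malformed input
    except UnicodeDecodeError as err:
        return err.start, err.reason
    return None


raw = bytes.fromhex(HEX_KEY)
problem = find_invalid_utf8(raw)
if problem is not None:
    offset, reason = problem
    print(f"invalid byte at 0-based offset {offset}: "
          f"{raw[offset:offset + 2].hex()} ({reason})")

# ASSUMPTION: c0 80 is a modified-UTF-8 NUL. Under that assumption, swapping
# it for a plain 00 byte makes the whole value decode cleanly.
repaired = raw.replace(b"\xc0\x80", b"\x00").decode("utf-8")
print(repr(repaired))
```

If the repaired string looks like what your application meant to write, the client code that serializes these values is the place to look, rather than Cassandra itself.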

Upvotes: 1
