Reputation: 69
Up to this point all was well. We had a BSOD on one machine and now have corrupt SSTables. We are trying to find the correct procedure to get this node back online. I would love to just wipe the data and repair the node, since we have a replication factor of 2, but I can't do that because of the amount of data on each node.
Attached is the error.
I tried to run nodetool scrub, but since DSE cannot start, I get the usual "cannot connect to 127.0.0.1" error.
Should I edit the config to change disk_failure_policy from stop to best_effort, then start the node and run the command?
Thanks,
ERROR 20:58:34 Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:169) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:741) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:692) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:480) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:376) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:523) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_66]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
Caused by: java.io.EOFException: null
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.8.0_66]
    at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.8.0_66]
    at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.8.0_66]
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    ... 14 common frames omitted
ERROR 20:58:34 Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:169) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:741) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:692) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:480) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:376) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:523) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_66]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
Caused by: java.io.EOFException: null
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.8.0_66]
    at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.8.0_66]
    at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.8.0_66]
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[cassandra-all-2.1.11.908.jar:2.1.11.908]
    ... 14 common frames omitted
INFO 20:58:34 DSE shutting down...
INFO 20:58:34 All plugins are stopped.
Upvotes: 0
Views: 972
Reputation: 33
Modify the disk failure policy in cassandra.yaml on the failed node:
1) Set disk_failure_policy to best_effort
2) Start DSE (or the Cassandra service)
3) Run nodetool scrub
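For step 1, here is a minimal Python sketch of the config change. It assumes disk_failure_policy appears as an uncommented top-level key in cassandra.yaml; the function name set_disk_failure_policy is made up for illustration, and the location of cassandra.yaml depends on your install, so it is not hard-coded here.

```python
import re


def set_disk_failure_policy(yaml_text, policy="best_effort"):
    """Return yaml_text with the top-level disk_failure_policy set to `policy`.

    Hypothetical helper: works on the text of cassandra.yaml, not on the file,
    so the caller decides where the file lives and how to back it up.
    """
    # Match only an uncommented key at the start of a line, so that
    # commit_failure_policy and explanatory comments are left alone.
    pattern = re.compile(r"^disk_failure_policy\s*:.*$", re.MULTILINE)
    new_line = "disk_failure_policy: " + policy
    if pattern.search(yaml_text):
        return pattern.sub(new_line, yaml_text, count=1)
    # Key not present: append it as a new line at the end of the file.
    separator = "" if (not yaml_text or yaml_text.endswith("\n")) else "\n"
    return yaml_text + separator + new_line + "\n"
```

To use it, read cassandra.yaml into a string, keep a copy of the original file, write the returned text back, and then continue with steps 2 and 3. Once the scrub is done you will probably want to set the policy back to stop the same way.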
Upvotes: 0
Reputation: 31
Did you check whether a disk failure caused the SSTables to become corrupted? That is one of the main reasons for SSTable corruption. If that is the case, repair the disk and then run nodetool repair.
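To see which files were actually damaged, a rough Python sketch like the one below may help. It rests on an assumption drawn from the stack trace: the EOFException is thrown in readUTF while CompressionMetadata is being read, and readUTF starts by reading a 2-byte length, so a -CompressionInfo.db component shorter than 2 bytes would fail in exactly this way. The function name and the data_dir argument are hypothetical; point it at whatever data directory your node uses.

```python
import os


def find_truncated_compression_info(data_dir, min_bytes=2):
    """List -CompressionInfo.db files under data_dir smaller than min_bytes.

    Assumption: a file this short cannot even hold the 2-byte length prefix
    that readUTF expects, so it is a likely source of the EOFException.
    """
    suspects = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if not name.endswith("-CompressionInfo.db"):
                continue
            path = os.path.join(root, name)
            if os.path.getsize(path) < min_bytes:
                suspects.append(path)
    return sorted(suspects)
```

This only reads file sizes and changes nothing on disk. It will not catch other kinds of corruption, so an empty result does not mean the remaining SSTables are healthy.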
Upvotes: 0