icdevppl

Reputation: 175

Recovering Kafka Data from .log Files

I have a single-node Kafka broker that crashed recently. I was able to salvage the .log and .index files from /tmp/kafka-logs/mytopic-0/, and I have moved these files to a different server and installed Kafka on it.

Is there a way to have the new Kafka server serve the data contained in these .log files?

Update:

I probably didn't do this the right way, but here is what I've tried:

  1. created a topic named recovermytopic on the new Kafka server, then stopped Kafka
  2. moved all the .log files into /tmp/kafka-logs/recovermytopic-0
  3. restarted kafka
  4. Kafka appeared to generate a .index file for each .log file, which looked promising, but after the index files were created I saw the messages below:

    WARN Partition [recovermytopic,0] on broker 0: No checkpointed highwatermark is found for partition [recovermytopic,0] (kafka.cluster.Partition)  
    INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions [recovermytopic,0] (kafka.server.ReplicaFetcherManager)
    

When I try to read the topic using kafka-console-consumer, the Kafka server logs:

    INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)

and no messages are consumed.

Upvotes: 3

Views: 3307

Answers (1)

Mansoor Siddiqui

Reputation: 21693

Kafka comes packaged with a DumpLogSegments tool that will extract messages (along with offsets, etc.) from Kafka data log files:

    $KAFKA_HOME/bin/kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --print-data-log --files mytopic-0/00000000000000132285.log > 00000000000000132285_messages.out

The output will vary a bit depending on which version of Kafka you're using, but it should be easy to extract the message keys and values with the use of sed or some other tool. The messages can then be replayed into your Kafka cluster using the kafka-console-producer.sh tool, or programmatically.
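As a rough sketch of the extraction step, the helper below pulls message values out of the dump file. It assumes each record line contains the marker "payload: " followed by the value, and that values are single-line text; that matches the dumps I have in mind, but check a few lines of your own output first, since the layout differs between versions. The function name extract_payloads is made up for this example, and awk is used here instead of sed so that the match stops at the first "payload: " on the line.

```shell
#!/bin/sh
# Sketch only. Assumption: every record line in the DumpLogSegments output
# contains "payload: <value>" and the value has no embedded newlines.
# Header lines (e.g. "Dumping ...", "Starting offset: ...") have no such
# marker and are skipped.
extract_payloads() {
  # Reads the named files, or stdin when no file is given.
  # index() finds the FIRST "payload: " so values that themselves contain
  # that text are not truncated; the marker is 9 characters long.
  awk '{ i = index($0, "payload: "); if (i) print substr($0, i + 9) }' "$@"
}

# Hypothetical replay (broker address and topic name are placeholders):
#   extract_payloads 00000000000000132285_messages.out |
#     "$KAFKA_HOME/bin/kafka-console-producer.sh" \
#       --broker-list localhost:9092 --topic recovermytopic
```

This recovers values only; keys, original offsets and timestamps are not carried over, and binary payloads will not survive the round trip through text. Newer Kafka releases take --bootstrap-server in place of --broker-list on the console producer.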

While this method is a bit roundabout, I think it's more transparent/reliable than trying to get a broker to start with data log files obtained from somewhere else. I've tested the DumpLogSegments tool with various versions of Kafka from 0.9 all the way up to 2.4.

Upvotes: 2
