Paul

Reputation: 189

Apache Pulsar infinite retention

The Apache Pulsar documentation says that a topic's time-based retention policy can be set to -1 for infinite retention. What are the downsides of infinite retention? Can we use Pulsar as a message store where data lives in topics forever, and build an event-sourcing application around it?

Upvotes: 4

Views: 878

Answers (3)

Andrew Orlosky

Reputation: 1

Taking advantage of Pulsar's features is the better option because it gives your data store more organization. Pulsar's strength is a storage layer that is kept separate from the topics and supports tiered storage, so I would recommend going that route: your data will be both more secure and easily accessible.

Upvotes: 0

David Kjerrumgaard

Reputation: 1076

Actually, you can and should use Pulsar's Tiered Storage option to offload your older data to more cost-effective storage such as S3, Google Cloud Storage, or HDFS. Unlike Kafka, Pulsar has decoupled the serving layer from the storage layer, which is what makes this possible. In Kafka, you would have to "add hard drives endlessly", along with broker instances to hold them.
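As a rough sketch of what that setup could look like, the snippet below only builds the two `pulsar-admin` command lines involved: one that sets infinite retention on a namespace, and one that sets a size threshold above which older ledgers are offloaded to tiered storage. The namespace name, the `10G` threshold, and the helper function names are made up for illustration, and offloading also assumes the brokers already have an offload driver configured.

```python
import shlex


def infinite_retention_cmd(namespace):
    """Build the pulsar-admin call that sets retention size and time to -1 (no limit)."""
    return ["pulsar-admin", "namespaces", "set-retention", namespace,
            "--size", "-1", "--time", "-1"]


def offload_threshold_cmd(namespace, threshold="10G"):
    """Build the pulsar-admin call that offloads data once a topic exceeds `threshold`."""
    return ["pulsar-admin", "namespaces", "set-offload-threshold",
            "--size", threshold, namespace]


if __name__ == "__main__":
    ns = "my-tenant/my-namespace"  # hypothetical namespace, replace with your own
    for cmd in (infinite_retention_cmd(ns), offload_threshold_cmd(ns)):
        # Print rather than execute, so the commands can be reviewed first.
        print(shlex.join(cmd))
```

Note that retention applies to messages that have already been acknowledged; unacknowledged messages are kept in the subscription backlog regardless.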

Upvotes: 4

Ivan Kelly

Reputation: 204

The downside is that your data will grow forever. However, due to the segment-based architecture of the underlying storage (BookKeeper), more space can be added by adding storage nodes (i.e. all the data doesn't have to fit on one machine, as is the case in some other systems).

The segment-based architecture also makes it fairly straightforward to move data to a bulk storage system (S3 or similar) while still having it available from Pulsar. However, this is still in the early stages of discussion right now.
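To get a feel for "grow forever", here is a back-of-the-envelope estimate, not a sizing tool. The function names, the ingest figures, the replication factor of 2, and the 4 TB of usable disk per storage node are all assumptions chosen for illustration; it also ignores compression, journal overhead, and offloading.

```python
def retained_bytes(msgs_per_sec, avg_msg_bytes, days, replicas=2):
    """Bytes stored after `days` of steady ingest, counting every replica."""
    seconds = days * 24 * 60 * 60
    return msgs_per_sec * avg_msg_bytes * seconds * replicas


def storage_nodes_needed(total_bytes, usable_bytes_per_node):
    """Smallest number of storage nodes whose combined disk holds `total_bytes`."""
    return -(-total_bytes // usable_bytes_per_node)  # ceiling division


if __name__ == "__main__":
    # Assumed workload: 1000 msg/s at 1 KiB each, kept for one year.
    total = retained_bytes(1000, 1024, 365)
    nodes = storage_nodes_needed(total, 4 * 10**12)
    print(f"{total / 10**12:.1f} TB retained, about {nodes} storage nodes")
```

With these assumed numbers a single year already needs more than a dozen nodes, which is why spreading segments across nodes (and eventually moving old ones to cheaper storage) matters.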

Upvotes: 9
