Reputation: 189
The Apache Pulsar topic documentation says we can set a topic's time-based retention policy to -1 for infinite retention. What are the downsides of infinite retention, and can we use Pulsar as a message store where data lives forever in topics and build an event-sourcing application around them?
Upvotes: 4
Views: 878
Reputation: 1
Using Pulsar for this is the better option because it gives your data store more organization. Since Pulsar's strength is a storage layer that keeps tiered storage separate from topics, I would recommend going that route because your data will be both more secure and easily accessible.
Upvotes: 0
Reputation: 1076
Actually, you can and should use Pulsar's Tiered Storage option to offload your older data to more cost-effective storage such as S3, Google Cloud Storage, or HDFS. Unlike Kafka, Pulsar has decoupled the serving layer from the storage layer, which is what allows this. In Kafka, you would have to "add hard drives endlessly", plus broker instances to hold them.
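To make that concrete, here is a minimal sketch of the two namespace-level settings involved: infinite retention and an offload threshold. It only builds the HTTP requests for the Pulsar admin REST API and does not send them. The admin URL, the namespace name, and the helper function names are my own placeholders, and the endpoint paths and payload shapes are as I understand them from the admin API, so check them against the docs for your Pulsar version.

```python
import json
from urllib.request import Request

# Assumptions: admin endpoint on the default HTTP port, hypothetical namespace.
ADMIN_URL = "http://localhost:8080"
NAMESPACE = "my-tenant/my-namespace"


def retention_request(namespace, minutes=-1, size_mb=-1):
    """Build (not send) a request setting namespace retention.

    Both the time and the size limit must be -1 for retention to be infinite.
    """
    body = json.dumps(
        {"retentionTimeInMinutes": minutes, "retentionSizeInMB": size_mb}
    )
    return Request(
        f"{ADMIN_URL}/admin/v2/namespaces/{namespace}/retention",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def offload_threshold_request(namespace, threshold_bytes):
    """Build (not send) a request setting the automatic offload threshold.

    Once a topic holds more than threshold_bytes on the bookies, older
    segments become candidates for offload to tiered storage.
    """
    return Request(
        f"{ADMIN_URL}/admin/v2/namespaces/{namespace}/offloadThreshold",
        data=json.dumps(threshold_bytes).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    for req in (
        retention_request(NAMESPACE),
        offload_threshold_request(NAMESPACE, 10 * 1024**3),  # 10 GiB
    ):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To actually apply a setting you would pass the request to `urllib.request.urlopen`, adding whatever authentication your cluster requires. The same settings are available from the CLI via `pulsar-admin namespaces set-retention` and `pulsar-admin namespaces set-offload-threshold`.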
Upvotes: 4
Reputation: 204
The downside is that your data will grow forever. However, due to the segment-based architecture of the underlying storage (BookKeeper), more space can be added by adding storage nodes (i.e. all the data doesn't have to fit on one machine, as is the case in some other systems).
The segment-based architecture also makes it fairly straightforward to move data to a bulk storage system (such as S3) while still having it available from Pulsar. However, this is still in the early stages of discussion right now.
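To get a feel for what "grow forever" means in practice, here is a rough back-of-the-envelope sketch. All the numbers (message rate, message size, write quorum, usable disk per storage node) are hypothetical, and the estimate ignores compression, batching, and any metadata or journal overhead, so treat it as an order-of-magnitude guide only.

```python
def stored_bytes(msgs_per_sec, avg_msg_bytes, days, write_quorum):
    """Raw bytes kept on storage nodes after `days` of infinite retention.

    Each entry is written to `write_quorum` storage nodes, so the on-disk
    footprint is the ingested volume multiplied by that factor.
    """
    seconds = days * 24 * 60 * 60
    return msgs_per_sec * avg_msg_bytes * seconds * write_quorum


def storage_nodes_needed(total_bytes, usable_bytes_per_node):
    """Smallest node count whose combined usable disk holds total_bytes."""
    return -(-total_bytes // usable_bytes_per_node)  # integer ceiling division


if __name__ == "__main__":
    # Hypothetical workload: 1,000 msg/s of 1 KiB messages, write quorum 2.
    one_year = stored_bytes(1000, 1024, 365, 2)
    print(f"after one year: {one_year / 1e12:.1f} TB")
    # Hypothetical nodes with 4 TB of usable disk each.
    print("storage nodes needed:", storage_nodes_needed(one_year, 4 * 10**12))
```

Since the total grows linearly with time, the node count never stops climbing unless older segments are moved somewhere cheaper.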
Upvotes: 9