anonymous123

Reputation: 1285

Apache Flume sampling rate

Is it possible to specify a sampling rate in Flume so that only a fraction of the records gets written to HDFS? Is there a Flume sink setting for that, or do we need to write our own Flume interceptor for sampling? I could not find anything about this in the Apache Flume User Guide.

Upvotes: 0

Views: 97

Answers (1)

Erik Schmiegelow

Reputation: 2759

Yes, you can achieve that by specifying the batch size in the HDFS sink:

# 100 is the default
hdfs.batchSize = 100

You should also make sure that the channel capacity you specify is large enough.
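To show where these settings live in an agent configuration, here is a minimal sketch. The agent, channel, and sink names (a1, c1, k1), the HDFS path, and the capacity values are placeholders chosen for illustration, not taken from the question. Note that hdfs.batchSize sets how many events are written to a file before it is flushed to HDFS, so it is worth checking whether batching alone gives you the sampling behaviour you want.

```properties
# Hypothetical agent "a1" with one memory channel "c1" and one HDFS sink "k1".
# The source definition is omitted; only the channel and sink settings are shown.
a1.channels = c1
a1.sinks = k1

# Memory channel: capacity is the maximum number of events held in the channel,
# transactionCapacity is the maximum number of events per transaction.
# Both values are illustrative; size them for your event rate.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# HDFS sink: batchSize is the number of events written to a file
# before it is flushed to HDFS (100 is the default).
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events
a1.sinks.k1.hdfs.batchSize = 100
```

Keep the channel's transactionCapacity at least as large as the sink's batch size, otherwise the sink cannot take a full batch from the channel in one transaction.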

Upvotes: 1
