user1523567

Reputation: 59

Flume file_roll sink gets stuck after a few minutes

I am using the Flume file_roll sink type to sink a high volume of data (~10,000 events/second) via a syslogTCP source. However, the process (a Spark Streaming job) that pushes data to the syslogTCP port gets stuck after 15-20 minutes, having ingested around 1.5 million events. I also observed a file descriptor issue on the Linux box where the flume-ng agent is running.

Below is the Flume configuration I am using:

agent2.sources = r1
agent2.channels = c1
agent2.sinks = f1

agent2.sources.r1.type = syslogtcp
agent2.sources.r1.bind = i-170d29de.aws.amgen.com
agent2.sources.r1.port = 44442

agent2.channels.c1.type = memory
agent2.channels.c1.capacity = 1000000000
agent2.channels.c1.transactionCapacity = 40000

agent2.sinks.f1.type = file_roll
agent2.sinks.f1.sink.directory = /opt/app/svc-edl-ops-ngmp-dev/rdas/flume_output
agent2.sinks.f1.sink.rollInterval = 300
agent2.sinks.f1.sink.rollSize = 104857600
agent2.sinks.f1.sink.rollCount = 0

agent2.sources.r1.channels = c1
agent2.sinks.f1.channel = c1

Because of performance issues at this high ingestion rate, I cannot use the HDFS sink type.

Upvotes: 0

Views: 203

Answers (1)

user1523567

Reputation: 59

This was my bad. I was using console logging, and at some point the PuTTY terminal froze because of a connectivity issue, which caused the entire Flume agent to choke. Redirecting the Flume console output, or using a log4j.properties that writes output to a file instead of the console, resolved the freezing issue.
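As an illustration of the fix described above, here is one way to launch the agent so that its stdout/stderr go to a file rather than the terminal; a frozen SSH session then cannot block the agent. This is a sketch: the conf directory, config file name, and log paths are assumptions, not taken from the post, and LOGFILE refers to the file appender defined in the log4j.properties shipped with Flume.

```shell
# Run the Flume agent detached, with all console output redirected to a file,
# so a stalled terminal can no longer back-pressure the agent's stdout.
# Paths and file names below are examples; adjust to your installation.
nohup flume-ng agent \
  --conf /path/to/flume/conf \
  --conf-file agent2.conf \
  --name agent2 \
  -Dflume.root.logger=INFO,LOGFILE \
  > /var/log/flume/agent2.out 2>&1 &
```

Alternatively, keep the default launch command and only change `flume.root.logger` in conf/log4j.properties from the console appender to the file appender; either way, nothing the agent writes depends on an interactive terminal.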

Upvotes: 0
